* [RFC PATCH 0/5] xfs: use generic percpu counters for icsb
@ 2015-02-01 21:42 Dave Chinner
From: Dave Chinner @ 2015-02-01 21:42 UTC (permalink / raw)
  To: xfs

Hi folks,

After listening to Eric swear about the per-cpu counter
implementation we have for the in-core superblock all week, I
decided the best thing to do would be to simply replace them with
generic per-cpu counters. The current icsb counters were implemented
long before we had generic counter infrastructure, and it's remained
that way because if it ain't broke....

Anyway, we do have a couple of issues with the counters to do with
enforcing the maximum inode count on small filesystems. Fixing these
problems is what Eric spent time swearing about.

Anyway, to cut a long story short, there is nothing unique about the
inode counters - neither the allocated inode count nor the free
inode count need to be accurate at zero as they are not used for
ENOSPC enforcement at this limit, and the allocated inode count
doesn't need to be perfectly accurate at the maximum count, either.
Hence we can just replace them with generic per-cpu counters without
second thoughts.

The free block counter is a little different. We need to be able to
accurately determine zero free blocks due to ENOSPC detection
requirements, and this is where all the complexity came from in the
existing infrastructure. The key technique that the existing
infrastructure uses to be accurate at zero is that it goes back to
a global lock and serialisation as it approaches zero. Hence we
trade off scalability for accuracy at ENOSPC.

It turns out we can play the same trick with the generic per-cpu
counter infrastructure. They allow a customised "batch" value, which
is the threshold at which the local per-cpu counter is folded back
into the global counter. By setting this batch to 1 we effectively
serialise all modifications to the counter as any change will be
over the batch fold threshold. Hence we can add a simple check on
the global counter value and switch from large batch values to small
values as we approach the zero threshold.
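
To make that concrete, here is a condensed sketch of the mechanism
(patch 4 below carries the real version; the batch constant here is
illustrative):

	s32 batch = 1024;	/* large batch: fast, but imprecise */

	/*
	 * Once the global count falls within two batches of zero,
	 * drop the batch to 1 so every modification folds straight
	 * into the global counter and the zero crossing is exact.
	 */
	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks, 2 * 1024) < 0)
		batch = 1;

	__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
				   XFS_ALLOC_SET_ASIDE(mp)) >= 0)
		return 0;	/* still had space */
	/* otherwise back the change out and dip into the reserve pool */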

This patchset has passed xfstests with no regressions, and there are
no performance impacts measurable on my 16p test VM on inode
allocation/freeing intensive workloads, nor on delayed allocation
workloads (which reserve a block at a time and hence trigger
extremely frequent updates) at IO rates of over 1GB/s. It also fixes
the maxicount enforcement issue on small filesystems that started
this off...

SGI:  this is a change that you are going to want to test for
regressions on one of your large machines that has multiple GB/s of
IO bandwidth. I don't expect there to be any problems, but if
there are we might need to tweak batch thresholds based on CPU
count......

This patchset is based on for-next, as it is dependent on the
superblock logging changes that are already queued for the next
cycle. Diffstat is as follows:

 fs/xfs/libxfs/xfs_bmap.c   |  16 +-
 fs/xfs/libxfs/xfs_format.h |  96 +------
 fs/xfs/libxfs/xfs_ialloc.c |   6 +-
 fs/xfs/libxfs/xfs_sb.c     |  43 +--
 fs/xfs/xfs_fsops.c         |  16 +-
 fs/xfs/xfs_iomap.c         |   3 +-
 fs/xfs/xfs_linux.h         |   9 -
 fs/xfs/xfs_log_recover.c   |   5 +-
 fs/xfs/xfs_mount.c         | 730 ++++++-----------------------------------------
 fs/xfs/xfs_mount.h         |  67 +----
 fs/xfs/xfs_rtalloc.c       |   6 +-
 fs/xfs/xfs_super.c         | 101 +++++--
 fs/xfs/xfs_super.h         |  83 ++++++
 fs/xfs/xfs_trans.c         |  19 +-
 14 files changed, 309 insertions(+), 891 deletions(-)

Comments, thoughts?

-Dave.


* [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
From: Dave Chinner @ 2015-02-01 21:42 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

With the removal of the bitfield based superblock logging code, the
in-core struct xfs_sb is no longer tied to the on-disk format. That
means it can be moved out of xfs_format.h and into a kernel specific
header. This allows the in-core superblock to change size, shape and
structure to suit the requirements of the in-memory processing
rather than be constrained by the on-disk layout.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_format.h | 96 ++--------------------------------------------
 fs/xfs/libxfs/xfs_sb.c     |  4 +-
 fs/xfs/xfs_log_recover.c   |  2 +-
 fs/xfs/xfs_mount.c         | 18 ++++-----
 fs/xfs/xfs_mount.h         |  2 +-
 fs/xfs/xfs_rtalloc.c       |  6 +--
 fs/xfs/xfs_super.c         |  6 +--
 fs/xfs/xfs_super.h         | 81 ++++++++++++++++++++++++++++++++++++++
 8 files changed, 103 insertions(+), 112 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 8eb7189..2c8d11f 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -91,98 +91,7 @@ struct xfs_ifork;
 	 XFS_SB_VERSION2_FTYPE)
 
 /*
- * Superblock - in core version.  Must match the ondisk version below.
- * Must be padded to 64 bit alignment.
- */
-typedef struct xfs_sb {
-	__uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
-	__uint32_t	sb_blocksize;	/* logical block size, bytes */
-	xfs_rfsblock_t	sb_dblocks;	/* number of data blocks */
-	xfs_rfsblock_t	sb_rblocks;	/* number of realtime blocks */
-	xfs_rtblock_t	sb_rextents;	/* number of realtime extents */
-	uuid_t		sb_uuid;	/* file system unique id */
-	xfs_fsblock_t	sb_logstart;	/* starting block of log if internal */
-	xfs_ino_t	sb_rootino;	/* root inode number */
-	xfs_ino_t	sb_rbmino;	/* bitmap inode for realtime extents */
-	xfs_ino_t	sb_rsumino;	/* summary inode for rt bitmap */
-	xfs_agblock_t	sb_rextsize;	/* realtime extent size, blocks */
-	xfs_agblock_t	sb_agblocks;	/* size of an allocation group */
-	xfs_agnumber_t	sb_agcount;	/* number of allocation groups */
-	xfs_extlen_t	sb_rbmblocks;	/* number of rt bitmap blocks */
-	xfs_extlen_t	sb_logblocks;	/* number of log blocks */
-	__uint16_t	sb_versionnum;	/* header version == XFS_SB_VERSION */
-	__uint16_t	sb_sectsize;	/* volume sector size, bytes */
-	__uint16_t	sb_inodesize;	/* inode size, bytes */
-	__uint16_t	sb_inopblock;	/* inodes per block */
-	char		sb_fname[12];	/* file system name */
-	__uint8_t	sb_blocklog;	/* log2 of sb_blocksize */
-	__uint8_t	sb_sectlog;	/* log2 of sb_sectsize */
-	__uint8_t	sb_inodelog;	/* log2 of sb_inodesize */
-	__uint8_t	sb_inopblog;	/* log2 of sb_inopblock */
-	__uint8_t	sb_agblklog;	/* log2 of sb_agblocks (rounded up) */
-	__uint8_t	sb_rextslog;	/* log2 of sb_rextents */
-	__uint8_t	sb_inprogress;	/* mkfs is in progress, don't mount */
-	__uint8_t	sb_imax_pct;	/* max % of fs for inode space */
-					/* statistics */
-	/*
-	 * These fields must remain contiguous.  If you really
-	 * want to change their layout, make sure you fix the
-	 * code in xfs_trans_apply_sb_deltas().
-	 */
-	__uint64_t	sb_icount;	/* allocated inodes */
-	__uint64_t	sb_ifree;	/* free inodes */
-	__uint64_t	sb_fdblocks;	/* free data blocks */
-	__uint64_t	sb_frextents;	/* free realtime extents */
-	/*
-	 * End contiguous fields.
-	 */
-	xfs_ino_t	sb_uquotino;	/* user quota inode */
-	xfs_ino_t	sb_gquotino;	/* group quota inode */
-	__uint16_t	sb_qflags;	/* quota flags */
-	__uint8_t	sb_flags;	/* misc. flags */
-	__uint8_t	sb_shared_vn;	/* shared version number */
-	xfs_extlen_t	sb_inoalignmt;	/* inode chunk alignment, fsblocks */
-	__uint32_t	sb_unit;	/* stripe or raid unit */
-	__uint32_t	sb_width;	/* stripe or raid width */
-	__uint8_t	sb_dirblklog;	/* log2 of dir block size (fsbs) */
-	__uint8_t	sb_logsectlog;	/* log2 of the log sector size */
-	__uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
-	__uint32_t	sb_logsunit;	/* stripe unit size for the log */
-	__uint32_t	sb_features2;	/* additional feature bits */
-
-	/*
-	 * bad features2 field as a result of failing to pad the sb structure to
-	 * 64 bits. Some machines will be using this field for features2 bits.
-	 * Easiest just to mark it bad and not use it for anything else.
-	 *
-	 * This is not kept up to date in memory; it is always overwritten by
-	 * the value in sb_features2 when formatting the incore superblock to
-	 * the disk buffer.
-	 */
-	__uint32_t	sb_bad_features2;
-
-	/* version 5 superblock fields start here */
-
-	/* feature masks */
-	__uint32_t	sb_features_compat;
-	__uint32_t	sb_features_ro_compat;
-	__uint32_t	sb_features_incompat;
-	__uint32_t	sb_features_log_incompat;
-
-	__uint32_t	sb_crc;		/* superblock crc */
-	__uint32_t	sb_pad;
-
-	xfs_ino_t	sb_pquotino;	/* project quota inode */
-	xfs_lsn_t	sb_lsn;		/* last write sequence */
-
-	/* must be padded to 64 bit alignment */
-} xfs_sb_t;
-
-#define XFS_SB_CRC_OFF		offsetof(struct xfs_sb, sb_crc)
-
-/*
- * Superblock - on disk version.  Must match the in core version above.
- * Must be padded to 64 bit alignment.
+ * Superblock - on disk version. Must be padded to 64 bit alignment.
  */
 typedef struct xfs_dsb {
 	__be32		sb_magicnum;	/* magic number == XFS_SB_MAGIC */
@@ -255,6 +164,7 @@ typedef struct xfs_dsb {
 	__be32		sb_features_incompat;
 	__be32		sb_features_log_incompat;
 
+#define XFS_SB_CRC_OFF		offsetof(struct xfs_dsb, sb_crc)
 	__le32		sb_crc;		/* superblock crc */
 	__be32		sb_pad;
 
@@ -562,7 +472,7 @@ static inline int xfs_sb_version_hasftype(struct xfs_sb *sbp)
 		 (sbp->sb_features2 & XFS_SB_VERSION2_FTYPE));
 }
 
-static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
+static inline int xfs_sb_version_hasfinobt(struct xfs_sb *sbp)
 {
 	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5) &&
 		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_FINOBT);
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index b0a5fe9..4cf335b 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -107,7 +107,7 @@ xfs_perag_put(
 STATIC int
 xfs_mount_validate_sb(
 	xfs_mount_t	*mp,
-	xfs_sb_t	*sbp,
+	struct xfs_sb	*sbp,
 	bool		check_inprogress,
 	bool		check_version)
 {
@@ -714,7 +714,7 @@ xfs_initialize_perag_data(
 {
 	xfs_agnumber_t	index;
 	xfs_perag_t	*pag;
-	xfs_sb_t	*sbp = &mp->m_sb;
+	struct xfs_sb	*sbp = &mp->m_sb;
 	uint64_t	ifree = 0;
 	uint64_t	ialloc = 0;
 	uint64_t	bfree = 0;
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index a5a945f..81d7f24 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -4410,7 +4410,7 @@ xlog_do_recover(
 {
 	int		error;
 	xfs_buf_t	*bp;
-	xfs_sb_t	*sbp;
+	struct xfs_sb	*sbp;
 
 	/*
 	 * First replay the images in the log.
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 4fa80e6..6015f54 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -166,7 +166,7 @@ xfs_free_perag(
  */
 int
 xfs_sb_validate_fsb_count(
-	xfs_sb_t	*sbp,
+	struct xfs_sb	*sbp,
 	__uint64_t	nblocks)
 {
 	ASSERT(PAGE_SHIFT >= sbp->sb_blocklog);
@@ -189,7 +189,7 @@ xfs_initialize_perag(
 	xfs_perag_t	*pag;
 	xfs_agino_t	agino;
 	xfs_ino_t	ino;
-	xfs_sb_t	*sbp = &mp->m_sb;
+	struct xfs_sb	*sbp = &mp->m_sb;
 	int		error = -ENOMEM;
 
 	/*
@@ -368,7 +368,7 @@ release_buf:
 STATIC int
 xfs_update_alignment(xfs_mount_t *mp)
 {
-	xfs_sb_t	*sbp = &(mp->m_sb);
+	struct xfs_sb	*sbp = &(mp->m_sb);
 
 	if (mp->m_dalign) {
 		/*
@@ -434,7 +434,7 @@ xfs_update_alignment(xfs_mount_t *mp)
 STATIC void
 xfs_set_maxicount(xfs_mount_t *mp)
 {
-	xfs_sb_t	*sbp = &(mp->m_sb);
+	struct xfs_sb	*sbp = &(mp->m_sb);
 	__uint64_t	icount;
 
 	if (sbp->sb_imax_pct) {
@@ -461,7 +461,7 @@ xfs_set_maxicount(xfs_mount_t *mp)
 STATIC void
 xfs_set_rw_sizes(xfs_mount_t *mp)
 {
-	xfs_sb_t	*sbp = &(mp->m_sb);
+	struct xfs_sb	*sbp = &(mp->m_sb);
 	int		readio_log, writeio_log;
 
 	if (!(mp->m_flags & XFS_MOUNT_DFLT_IOSIZE)) {
@@ -630,7 +630,7 @@ int
 xfs_mountfs(
 	xfs_mount_t	*mp)
 {
-	xfs_sb_t	*sbp = &(mp->m_sb);
+	struct xfs_sb	*sbp = &(mp->m_sb);
 	xfs_inode_t	*rip;
 	__uint64_t	resblks;
 	uint		quotamount = 0;
@@ -642,9 +642,9 @@ xfs_mountfs(
 	/*
 	 * Check for a mismatched features2 values.  Older kernels read & wrote
 	 * into the wrong sb offset for sb_features2 on some platforms due to
-	 * xfs_sb_t not being 64bit size aligned when sb_features2 was added,
-	 * which made older superblock reading/writing routines swap it as a
-	 * 64-bit value.
+	 * struct xfs_sb not being 64bit size aligned when sb_features2 was
+	 * added, which made older superblock reading/writing routines swap it
+	 * as a 64-bit value.
 	 *
 	 * For backwards compatibility, we make both slots equal.
 	 *
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index a5b2ff8..9d62134 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -81,7 +81,7 @@ typedef struct xfs_mount {
 	struct super_block	*m_super;
 	xfs_tid_t		m_tid;		/* next unused tid for fs */
 	struct xfs_ail		*m_ail;		/* fs active log item list */
-	xfs_sb_t		m_sb;		/* copy of fs superblock */
+	struct xfs_sb		m_sb;		/* copy of fs superblock */
 	spinlock_t		m_sb_lock;	/* sb counter lock */
 	struct xfs_buf		*m_sb_bp;	/* buffer for superblock */
 	char			*m_fsname;	/* filesystem name */
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index f2079b6..7e15ec9 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -897,10 +897,10 @@ xfs_growfs_rt(
 	xfs_extlen_t	nrsumblocks;	/* new number of summary blocks */
 	uint		nrsumlevels;	/* new rt summary levels */
 	uint		nrsumsize;	/* new size of rt summary, bytes */
-	xfs_sb_t	*nsbp;		/* new superblock */
+	struct xfs_sb	*nsbp;		/* new superblock */
 	xfs_extlen_t	rbmblocks;	/* current number of rt bitmap blocks */
 	xfs_extlen_t	rsumblocks;	/* current number of rt summary blks */
-	xfs_sb_t	*sbp;		/* old superblock */
+	struct xfs_sb	*sbp;		/* old superblock */
 	xfs_fsblock_t	sumbno;		/* summary block number */
 
 	sbp = &mp->m_sb;
@@ -1227,7 +1227,7 @@ xfs_rtmount_inodes(
 	xfs_mount_t	*mp)		/* file system mount structure */
 {
 	int		error;		/* error return value */
-	xfs_sb_t	*sbp;
+	struct xfs_sb	*sbp;
 
 	sbp = &mp->m_sb;
 	if (sbp->sb_rbmino == NULLFSINO)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 40d2ac5..7583044 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -602,7 +602,7 @@ xfs_set_inode32(struct xfs_mount *mp, xfs_agnumber_t agcount)
 {
 	xfs_agnumber_t	index = 0;
 	xfs_agnumber_t	maxagi = 0;
-	xfs_sb_t	*sbp = &mp->m_sb;
+	struct xfs_sb	*sbp = &mp->m_sb;
 	xfs_agnumber_t	max_metadata;
 	xfs_agino_t	agino;
 	xfs_ino_t	ino;
@@ -1084,7 +1084,7 @@ xfs_fs_statfs(
 	struct kstatfs		*statp)
 {
 	struct xfs_mount	*mp = XFS_M(dentry->d_sb);
-	xfs_sb_t		*sbp = &mp->m_sb;
+	struct xfs_sb		*sbp = &mp->m_sb;
 	struct xfs_inode	*ip = XFS_I(dentry->d_inode);
 	__uint64_t		fakeinos, id;
 	xfs_extlen_t		lsize;
@@ -1198,7 +1198,7 @@ xfs_fs_remount(
 	char			*options)
 {
 	struct xfs_mount	*mp = XFS_M(sb);
-	xfs_sb_t		*sbp = &mp->m_sb;
+	struct xfs_sb		*sbp = &mp->m_sb;
 	substring_t		args[MAX_OPT_ARGS];
 	char			*p;
 	int			error;
diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
index 2b830c2..a02236b 100644
--- a/fs/xfs/xfs_super.h
+++ b/fs/xfs/xfs_super.h
@@ -61,6 +61,87 @@ struct xfs_mount;
 struct xfs_buftarg;
 struct block_device;
 
+/*
+ * Superblock - in core version.  This does not have to match the size and shape
+ * of the on-disk superblock, but must contain all the fields that we use in the
+ * on-disk superblock.
+ */
+struct xfs_sb {
+	__uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
+	__uint32_t	sb_blocksize;	/* logical block size, bytes */
+	xfs_rfsblock_t	sb_dblocks;	/* number of data blocks */
+	xfs_rfsblock_t	sb_rblocks;	/* number of realtime blocks */
+	xfs_rtblock_t	sb_rextents;	/* number of realtime extents */
+	uuid_t		sb_uuid;	/* file system unique id */
+	xfs_fsblock_t	sb_logstart;	/* starting block of log if internal */
+	xfs_ino_t	sb_rootino;	/* root inode number */
+	xfs_ino_t	sb_rbmino;	/* bitmap inode for realtime extents */
+	xfs_ino_t	sb_rsumino;	/* summary inode for rt bitmap */
+	xfs_agblock_t	sb_rextsize;	/* realtime extent size, blocks */
+	xfs_agblock_t	sb_agblocks;	/* size of an allocation group */
+	xfs_agnumber_t	sb_agcount;	/* number of allocation groups */
+	xfs_extlen_t	sb_rbmblocks;	/* number of rt bitmap blocks */
+	xfs_extlen_t	sb_logblocks;	/* number of log blocks */
+	__uint16_t	sb_versionnum;	/* header version == XFS_SB_VERSION */
+	__uint16_t	sb_sectsize;	/* volume sector size, bytes */
+	__uint16_t	sb_inodesize;	/* inode size, bytes */
+	__uint16_t	sb_inopblock;	/* inodes per block */
+	char		sb_fname[12];	/* file system name */
+	__uint8_t	sb_blocklog;	/* log2 of sb_blocksize */
+	__uint8_t	sb_sectlog;	/* log2 of sb_sectsize */
+	__uint8_t	sb_inodelog;	/* log2 of sb_inodesize */
+	__uint8_t	sb_inopblog;	/* log2 of sb_inopblock */
+	__uint8_t	sb_agblklog;	/* log2 of sb_agblocks (rounded up) */
+	__uint8_t	sb_rextslog;	/* log2 of sb_rextents */
+	__uint8_t	sb_inprogress;	/* mkfs is in progress, don't mount */
+	__uint8_t	sb_imax_pct;	/* max % of fs for inode space */
+					/* statistics */
+	__uint64_t	sb_icount;	/* allocated inodes */
+	__uint64_t	sb_ifree;	/* free inodes */
+	__uint64_t	sb_fdblocks;	/* free data blocks */
+	__uint64_t	sb_frextents;	/* free realtime extents */
+	xfs_ino_t	sb_uquotino;	/* user quota inode */
+	xfs_ino_t	sb_gquotino;	/* group quota inode */
+	__uint16_t	sb_qflags;	/* quota flags */
+	__uint8_t	sb_flags;	/* misc. flags */
+	__uint8_t	sb_shared_vn;	/* shared version number */
+	xfs_extlen_t	sb_inoalignmt;	/* inode chunk alignment, fsblocks */
+	__uint32_t	sb_unit;	/* stripe or raid unit */
+	__uint32_t	sb_width;	/* stripe or raid width */
+	__uint8_t	sb_dirblklog;	/* log2 of dir block size (fsbs) */
+	__uint8_t	sb_logsectlog;	/* log2 of the log sector size */
+	__uint16_t	sb_logsectsize;	/* sector size for the log, bytes */
+	__uint32_t	sb_logsunit;	/* stripe unit size for the log */
+	__uint32_t	sb_features2;	/* additional feature bits */
+
+	/*
+	 * bad features2 field as a result of failing to pad the sb structure to
+	 * 64 bits. Some machines will be using this field for features2 bits.
+	 * Easiest just to mark it bad and not use it for anything else.
+	 *
+	 * This is not kept up to date in memory; it is always overwritten by
+	 * the value in sb_features2 when formatting the incore superblock to
+	 * the disk buffer.
+	 */
+	__uint32_t	sb_bad_features2;
+
+	/* version 5 superblock fields start here */
+
+	/* feature masks */
+	__uint32_t	sb_features_compat;
+	__uint32_t	sb_features_ro_compat;
+	__uint32_t	sb_features_incompat;
+	__uint32_t	sb_features_log_incompat;
+
+	__uint32_t	sb_crc;		/* superblock crc */
+	__uint32_t	sb_pad;
+
+	xfs_ino_t	sb_pquotino;	/* project quota inode */
+	xfs_lsn_t	sb_lsn;		/* last write sequence */
+
+	/* must be padded to 64 bit alignment */
+};
+
 extern __uint64_t xfs_max_file_offset(unsigned int);
 
 extern void xfs_flush_inodes(struct xfs_mount *mp);
-- 
2.0.0


* [PATCH 2/5] xfs: use generic percpu counters for inode counter
From: Dave Chinner @ 2015-02-01 21:43 UTC (permalink / raw)
  To: xfs

XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. There are some warts around
the use of them for the inode counter, as the hand-rolled counter is
designed to be accurate at zero but has no specific accuracy at
any other value. This design causes problems for the maximum inode
count threshold enforcement, as there is no trigger that balances
the counters as they get close to the maximum threshold.

Instead of designing new triggers for balancing, just replace the
handrolled per-cpu counter with a generic counter directly in the
incore superblock. This enables us to update the counter through the
normal superblock modification functions, rather than having to go
around them and then fall back to the normal modification functions
if the per-cpu nature of the counter has been turned off.

The result is a much simpler counter implementation that can be used
accurately across its entire range of values, rather than just near
zero.
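
For reference, the generic API being swapped in here (from
<linux/percpu_counter.h>) is used roughly like this - a sketch of the
calls, not patch code:

	struct percpu_counter icount;
	s64 exact, approx;

	if (percpu_counter_init(&icount, 0, GFP_KERNEL))
		return -ENOMEM;

	percpu_counter_add(&icount, 64);	/* cheap per-cpu update */
	exact = percpu_counter_sum(&icount);	/* accurate, takes the lock */
	approx = percpu_counter_read_positive(&icount); /* fast, no locking */

	percpu_counter_destroy(&icount);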

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_ialloc.c |  6 +++--
 fs/xfs/libxfs/xfs_sb.c     | 10 +++++---
 fs/xfs/xfs_fsops.c         |  3 ++-
 fs/xfs/xfs_mount.c         | 58 +++++++++++++++++-----------------------------
 fs/xfs/xfs_mount.h         |  1 -
 fs/xfs/xfs_super.c         |  8 ++++---
 fs/xfs/xfs_super.h         |  4 +++-
 fs/xfs/xfs_trans.c         |  5 ++--
 8 files changed, 44 insertions(+), 51 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_ialloc.c b/fs/xfs/libxfs/xfs_ialloc.c
index 116ef1d..95a1762 100644
--- a/fs/xfs/libxfs/xfs_ialloc.c
+++ b/fs/xfs/libxfs/xfs_ialloc.c
@@ -376,7 +376,8 @@ xfs_ialloc_ag_alloc(
 	 */
 	newlen = args.mp->m_ialloc_inos;
 	if (args.mp->m_maxicount &&
-	    args.mp->m_sb.sb_icount + newlen > args.mp->m_maxicount)
+	    percpu_counter_read(&args.mp->m_sb.sb_icount) + newlen >
+							args.mp->m_maxicount)
 		return -ENOSPC;
 	args.minlen = args.maxlen = args.mp->m_ialloc_blks;
 	/*
@@ -1340,7 +1341,8 @@ xfs_dialloc(
 	 * inode.
 	 */
 	if (mp->m_maxicount &&
-	    mp->m_sb.sb_icount + mp->m_ialloc_inos > mp->m_maxicount) {
+	    percpu_counter_read(&mp->m_sb.sb_icount) + mp->m_ialloc_inos >
+							mp->m_maxicount) {
 		noroom = 1;
 		okalloc = 0;
 	}
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 4cf335b..7bfa527 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -357,7 +357,8 @@ __xfs_sb_from_disk(
 	to->sb_rextslog = from->sb_rextslog;
 	to->sb_inprogress = from->sb_inprogress;
 	to->sb_imax_pct = from->sb_imax_pct;
-	to->sb_icount = be64_to_cpu(from->sb_icount);
+	if (percpu_counter_initialized(&to->sb_icount))
+		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
 	to->sb_ifree = be64_to_cpu(from->sb_ifree);
 	to->sb_fdblocks = be64_to_cpu(from->sb_fdblocks);
 	to->sb_frextents = be64_to_cpu(from->sb_frextents);
@@ -492,7 +493,7 @@ xfs_sb_to_disk(
 	to->sb_rextslog = from->sb_rextslog;
 	to->sb_inprogress = from->sb_inprogress;
 	to->sb_imax_pct = from->sb_imax_pct;
-	to->sb_icount = cpu_to_be64(from->sb_icount);
+	to->sb_icount = cpu_to_be64(percpu_counter_sum(&from->sb_icount));
 	to->sb_ifree = cpu_to_be64(from->sb_ifree);
 	to->sb_fdblocks = cpu_to_be64(from->sb_fdblocks);
 	to->sb_frextents = cpu_to_be64(from->sb_frextents);
@@ -537,6 +538,9 @@ xfs_sb_verify(
 	struct xfs_mount *mp = bp->b_target->bt_mount;
 	struct xfs_sb	sb;
 
+	/* don't need to validate icount here */
+	sb.sb_icount.counters = NULL;
+
 	/*
 	 * Use call variant which doesn't convert quota flags from disk 
 	 * format, because xfs_mount_validate_sb checks the on-disk flags.
@@ -748,7 +752,7 @@ xfs_initialize_perag_data(
 	 */
 	spin_lock(&mp->m_sb_lock);
 	sbp->sb_ifree = ifree;
-	sbp->sb_icount = ialloc;
+	percpu_counter_set(&sbp->sb_icount, ialloc);
 	sbp->sb_fdblocks = bfree + bfreelst + btree;
 	spin_unlock(&mp->m_sb_lock);
 
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index f711452..9cc34d2 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -631,11 +631,12 @@ xfs_fs_counts(
 	xfs_fsop_counts_t	*cnt)
 {
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
+	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
+
 	spin_lock(&mp->m_sb_lock);
 	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	cnt->freertx = mp->m_sb.sb_frextents;
 	cnt->freeino = mp->m_sb.sb_ifree;
-	cnt->allocino = mp->m_sb.sb_icount;
 	spin_unlock(&mp->m_sb_lock);
 	return 0;
 }
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 6015f54..df5ec55 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1127,13 +1127,13 @@ xfs_mod_incore_sb_unlocked(
 	 */
 	switch (field) {
 	case XFS_SBS_ICOUNT:
-		lcounter = (long long)mp->m_sb.sb_icount;
-		lcounter += delta;
-		if (lcounter < 0) {
+		/* deltas are +/-64, hence the large batch size of 128. */
+		__percpu_counter_add(&mp->m_sb.sb_icount, delta, 128);
+		if (percpu_counter_compare(&mp->m_sb.sb_icount, 0) < 0) {
 			ASSERT(0);
+			percpu_counter_add(&mp->m_sb.sb_icount, -delta);
 			return -EINVAL;
 		}
-		mp->m_sb.sb_icount = lcounter;
 		return 0;
 	case XFS_SBS_IFREE:
 		lcounter = (long long)mp->m_sb.sb_ifree;
@@ -1288,8 +1288,11 @@ xfs_mod_incore_sb(
 	int			status;
 
 #ifdef HAVE_PERCPU_SB
-	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
+	ASSERT(field < XFS_SBS_IFREE || field > XFS_SBS_FDBLOCKS);
 #endif
+	if (field == XFS_SBS_ICOUNT)
+		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
+
 	spin_lock(&mp->m_sb_lock);
 	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
 	spin_unlock(&mp->m_sb_lock);
@@ -1492,7 +1495,6 @@ xfs_icsb_cpu_notify(
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
 		xfs_icsb_lock(mp);
-		xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
 		xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
 		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 		xfs_icsb_unlock(mp);
@@ -1504,17 +1506,14 @@ xfs_icsb_cpu_notify(
 		 * re-enable the counters. */
 		xfs_icsb_lock(mp);
 		spin_lock(&mp->m_sb_lock);
-		xfs_icsb_disable_counter(mp, XFS_SBS_ICOUNT);
 		xfs_icsb_disable_counter(mp, XFS_SBS_IFREE);
 		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
 
-		mp->m_sb.sb_icount += cntp->icsb_icount;
 		mp->m_sb.sb_ifree += cntp->icsb_ifree;
 		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
 
 		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
 
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_ICOUNT, 0);
 		xfs_icsb_balance_counter_locked(mp, XFS_SBS_IFREE, 0);
 		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
 		spin_unlock(&mp->m_sb_lock);
@@ -1533,9 +1532,15 @@ xfs_icsb_init_counters(
 	xfs_icsb_cnts_t *cntp;
 	int		i;
 
+	i = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
+	if (i)
+		return -ENOMEM;
+
 	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
-	if (mp->m_sb_cnts == NULL)
+	if (!mp->m_sb_cnts) {
+		percpu_counter_destroy(&mp->m_sb.sb_icount);
 		return -ENOMEM;
+	}
 
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
@@ -1569,7 +1574,6 @@ xfs_icsb_reinit_counters(
 	 * initial balance kicks us off correctly
 	 */
 	mp->m_icsb_counters = -1;
-	xfs_icsb_balance_counter(mp, XFS_SBS_ICOUNT, 0);
 	xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
 	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 	xfs_icsb_unlock(mp);
@@ -1583,6 +1587,9 @@ xfs_icsb_destroy_counters(
 		unregister_hotcpu_notifier(&mp->m_icsb_notifier);
 		free_percpu(mp->m_sb_cnts);
 	}
+
+	percpu_counter_destroy(&mp->m_sb.sb_icount);
+
 	mutex_destroy(&mp->m_icsb_mutex);
 }
 
@@ -1645,7 +1652,6 @@ xfs_icsb_count(
 
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		cnt->icsb_icount += cntp->icsb_icount;
 		cnt->icsb_ifree += cntp->icsb_ifree;
 		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
 	}
@@ -1659,7 +1665,7 @@ xfs_icsb_counter_disabled(
 	xfs_mount_t	*mp,
 	xfs_sb_field_t	field)
 {
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
 	return test_bit(field, &mp->m_icsb_counters);
 }
 
@@ -1670,7 +1676,7 @@ xfs_icsb_disable_counter(
 {
 	xfs_icsb_cnts_t	cnt;
 
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
 
 	/*
 	 * If we are already disabled, then there is nothing to do
@@ -1689,9 +1695,6 @@ xfs_icsb_disable_counter(
 
 		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
 		switch(field) {
-		case XFS_SBS_ICOUNT:
-			mp->m_sb.sb_icount = cnt.icsb_icount;
-			break;
 		case XFS_SBS_IFREE:
 			mp->m_sb.sb_ifree = cnt.icsb_ifree;
 			break;
@@ -1716,15 +1719,12 @@ xfs_icsb_enable_counter(
 	xfs_icsb_cnts_t	*cntp;
 	int		i;
 
-	ASSERT((field >= XFS_SBS_ICOUNT) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
 
 	xfs_icsb_lock_all_counters(mp);
 	for_each_online_cpu(i) {
 		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
 		switch (field) {
-		case XFS_SBS_ICOUNT:
-			cntp->icsb_icount = count + resid;
-			break;
 		case XFS_SBS_IFREE:
 			cntp->icsb_ifree = count + resid;
 			break;
@@ -1750,8 +1750,6 @@ xfs_icsb_sync_counters_locked(
 
 	xfs_icsb_count(mp, &cnt, flags);
 
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_ICOUNT))
-		mp->m_sb.sb_icount = cnt.icsb_icount;
 	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_IFREE))
 		mp->m_sb.sb_ifree = cnt.icsb_ifree;
 	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
@@ -1805,12 +1803,6 @@ xfs_icsb_balance_counter_locked(
 
 	/* update counters  - first CPU gets residual*/
 	switch (field) {
-	case XFS_SBS_ICOUNT:
-		count = mp->m_sb.sb_icount;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
 	case XFS_SBS_IFREE:
 		count = mp->m_sb.sb_ifree;
 		resid = do_div(count, weight);
@@ -1871,14 +1863,6 @@ again:
 	}
 
 	switch (field) {
-	case XFS_SBS_ICOUNT:
-		lcounter = icsbp->icsb_icount;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_icount = lcounter;
-		break;
-
 	case XFS_SBS_IFREE:
 		lcounter = icsbp->icsb_ifree;
 		lcounter += delta;
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 9d62134..9499a8f 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -41,7 +41,6 @@ struct xfs_da_geometry;
 typedef struct xfs_icsb_cnts {
 	uint64_t	icsb_fdblocks;
 	uint64_t	icsb_ifree;
-	uint64_t	icsb_icount;
 	unsigned long	icsb_flags;
 } xfs_icsb_cnts_t;
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 7583044..408e2fe 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1087,6 +1087,7 @@ xfs_fs_statfs(
 	struct xfs_sb		*sbp = &mp->m_sb;
 	struct xfs_inode	*ip = XFS_I(dentry->d_inode);
 	__uint64_t		fakeinos, id;
+	__uint64_t		sb_icount;
 	xfs_extlen_t		lsize;
 	__int64_t		ffree;
 
@@ -1098,6 +1099,7 @@ xfs_fs_statfs(
 	statp->f_fsid.val[1] = (u32)(id >> 32);
 
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
+	sb_icount = percpu_counter_sum(&sbp->sb_icount);
 
 	spin_lock(&mp->m_sb_lock);
 	statp->f_bsize = sbp->sb_blocksize;
@@ -1106,15 +1108,15 @@ xfs_fs_statfs(
 	statp->f_bfree = statp->f_bavail =
 				sbp->sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	fakeinos = statp->f_bfree << sbp->sb_inopblog;
-	statp->f_files =
-	    MIN(sbp->sb_icount + fakeinos, (__uint64_t)XFS_MAXINUMBER);
+	statp->f_files = MIN(sb_icount + fakeinos,
+			     (__uint64_t)XFS_MAXINUMBER);
 	if (mp->m_maxicount)
 		statp->f_files = min_t(typeof(statp->f_files),
 					statp->f_files,
 					mp->m_maxicount);
 
 	/* make sure statp->f_ffree does not underflow */
-	ffree = statp->f_files - (sbp->sb_icount - sbp->sb_ifree);
+	ffree = statp->f_files - (sb_icount - sbp->sb_ifree);
 	statp->f_ffree = max_t(__int64_t, ffree, 0);
 
 	spin_unlock(&mp->m_sb_lock);
diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
index a02236b..fa5603c 100644
--- a/fs/xfs/xfs_super.h
+++ b/fs/xfs/xfs_super.h
@@ -96,7 +96,9 @@ struct xfs_sb {
 	__uint8_t	sb_inprogress;	/* mkfs is in progress, don't mount */
 	__uint8_t	sb_imax_pct;	/* max % of fs for inode space */
 					/* statistics */
-	__uint64_t	sb_icount;	/* allocated inodes */
+
+	struct percpu_counter	sb_icount;	/* allocated inodes */
+
 	__uint64_t	sb_ifree;	/* free inodes */
 	__uint64_t	sb_fdblocks;	/* free data blocks */
 	__uint64_t	sb_frextents;	/* free realtime extents */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index eb90cd5..d78b0ae 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -554,8 +554,7 @@ xfs_trans_unreserve_and_mod_sb(
 	}
 
 	if (idelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_ICOUNT,
-						 idelta, rsvd);
+		error = xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, idelta, rsvd);
 		if (error)
 			goto out_undo_fdblocks;
 	}
@@ -634,7 +633,7 @@ out_undo_ifreecount:
 		xfs_icsb_modify_counters(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
 out_undo_icount:
 	if (idelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
+		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
 out_undo_fdblocks:
 	if (blkdelta)
 		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
-- 
2.0.0


* [PATCH 3/5] xfs: use generic percpu counters for free inode counter
From: Dave Chinner @ 2015-02-01 21:43 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. The free inode counter is not
used for any limit enforcement - the per-AG free inode counters are
used during allocation to determine if there are inodes available for
allocation.

Hence we don't need any of the complexity of the hand-rolled
counters and we can simply replace them with generic per-cpu
counters similar to the inode counter.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_sb.c |  8 ++++---
 fs/xfs/xfs_fsops.c     |  2 +-
 fs/xfs/xfs_mount.c     | 62 +++++++++++++++++---------------------------------
 fs/xfs/xfs_super.c     |  4 +++-
 fs/xfs/xfs_super.h     |  2 +-
 fs/xfs/xfs_trans.c     |  5 ++--
 6 files changed, 33 insertions(+), 50 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 7bfa527..42e5c89 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -359,7 +359,8 @@ __xfs_sb_from_disk(
 	to->sb_imax_pct = from->sb_imax_pct;
 	if (percpu_counter_initialized(&to->sb_icount))
 		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
-	to->sb_ifree = be64_to_cpu(from->sb_ifree);
+	if (percpu_counter_initialized(&to->sb_ifree))
+		percpu_counter_set(&to->sb_ifree, be64_to_cpu(from->sb_ifree));
 	to->sb_fdblocks = be64_to_cpu(from->sb_fdblocks);
 	to->sb_frextents = be64_to_cpu(from->sb_frextents);
 	to->sb_uquotino = be64_to_cpu(from->sb_uquotino);
@@ -494,7 +495,7 @@ xfs_sb_to_disk(
 	to->sb_inprogress = from->sb_inprogress;
 	to->sb_imax_pct = from->sb_imax_pct;
 	to->sb_icount = cpu_to_be64(percpu_counter_sum(&from->sb_icount));
-	to->sb_ifree = cpu_to_be64(from->sb_ifree);
+	to->sb_ifree = cpu_to_be64(percpu_counter_sum(&from->sb_ifree));
 	to->sb_fdblocks = cpu_to_be64(from->sb_fdblocks);
 	to->sb_frextents = cpu_to_be64(from->sb_frextents);
 
@@ -540,6 +541,7 @@ xfs_sb_verify(
 
 	/* don't need to validate icount here */
 	sb.sb_icount.counters = NULL;
+	sb.sb_ifree.counters = NULL;
 
 	/*
 	 * Use call variant which doesn't convert quota flags from disk 
@@ -751,7 +753,7 @@ xfs_initialize_perag_data(
 	 * Overwrite incore superblock counters with just-read data
 	 */
 	spin_lock(&mp->m_sb_lock);
-	sbp->sb_ifree = ifree;
+	percpu_counter_set(&sbp->sb_ifree, ifree);
 	percpu_counter_set(&sbp->sb_icount, ialloc);
 	sbp->sb_fdblocks = bfree + bfreelst + btree;
 	spin_unlock(&mp->m_sb_lock);
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 9cc34d2..619a9f3 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -632,11 +632,11 @@ xfs_fs_counts(
 {
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
+	cnt->freeino = percpu_counter_read_positive(&mp->m_sb.sb_ifree);
 
 	spin_lock(&mp->m_sb_lock);
 	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	cnt->freertx = mp->m_sb.sb_frextents;
-	cnt->freeino = mp->m_sb.sb_ifree;
 	spin_unlock(&mp->m_sb_lock);
 	return 0;
 }
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index df5ec55..8e8924f 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1136,13 +1136,12 @@ xfs_mod_incore_sb_unlocked(
 		}
 		return 0;
 	case XFS_SBS_IFREE:
-		lcounter = (long long)mp->m_sb.sb_ifree;
-		lcounter += delta;
-		if (lcounter < 0) {
+		percpu_counter_add(&mp->m_sb.sb_ifree, delta);
+		if (percpu_counter_compare(&mp->m_sb.sb_ifree, 0) < 0) {
 			ASSERT(0);
+			percpu_counter_add(&mp->m_sb.sb_ifree, -delta);
 			return -EINVAL;
 		}
-		mp->m_sb.sb_ifree = lcounter;
 		return 0;
 	case XFS_SBS_FDBLOCKS:
 		lcounter = (long long)
@@ -1288,9 +1287,9 @@ xfs_mod_incore_sb(
 	int			status;
 
 #ifdef HAVE_PERCPU_SB
-	ASSERT(field < XFS_SBS_IFREE || field > XFS_SBS_FDBLOCKS);
+	ASSERT(field != XFS_SBS_FDBLOCKS);
 #endif
-	if (field == XFS_SBS_ICOUNT)
+	if (field == XFS_SBS_ICOUNT || field == XFS_SBS_IFREE)
 		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
 
 	spin_lock(&mp->m_sb_lock);
@@ -1495,7 +1494,6 @@ xfs_icsb_cpu_notify(
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
 		xfs_icsb_lock(mp);
-		xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
 		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 		xfs_icsb_unlock(mp);
 		break;
@@ -1506,15 +1504,12 @@ xfs_icsb_cpu_notify(
 		 * re-enable the counters. */
 		xfs_icsb_lock(mp);
 		spin_lock(&mp->m_sb_lock);
-		xfs_icsb_disable_counter(mp, XFS_SBS_IFREE);
 		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
 
-		mp->m_sb.sb_ifree += cntp->icsb_ifree;
 		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
 
 		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
 
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_IFREE, 0);
 		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
 		spin_unlock(&mp->m_sb_lock);
 		xfs_icsb_unlock(mp);
@@ -1536,11 +1531,13 @@ xfs_icsb_init_counters(
 	if (i)
 		return -ENOMEM;
 
+	i = percpu_counter_init(&mp->m_sb.sb_ifree, 0, GFP_KERNEL);
+	if (i)
+		goto free_icount;
+
 	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
-	if (!mp->m_sb_cnts) {
-		percpu_counter_destroy(&mp->m_sb.sb_icount);
-		return -ENOMEM;
-	}
+	if (!mp->m_sb_cnts)
+		goto free_ifree;
 
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
@@ -1562,6 +1559,12 @@ xfs_icsb_init_counters(
 #endif /* CONFIG_HOTPLUG_CPU */
 
 	return 0;
+
+free_ifree:
+	percpu_counter_destroy(&mp->m_sb.sb_ifree);
+free_icount:
+	percpu_counter_destroy(&mp->m_sb.sb_icount);
+	return -ENOMEM;
 }
 
 void
@@ -1574,7 +1577,6 @@ xfs_icsb_reinit_counters(
 	 * initial balance kicks us off correctly
 	 */
 	mp->m_icsb_counters = -1;
-	xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
 	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 	xfs_icsb_unlock(mp);
 }
@@ -1589,6 +1591,7 @@ xfs_icsb_destroy_counters(
 	}
 
 	percpu_counter_destroy(&mp->m_sb.sb_icount);
+	percpu_counter_destroy(&mp->m_sb.sb_ifree);
 
 	mutex_destroy(&mp->m_icsb_mutex);
 }
@@ -1652,7 +1655,6 @@ xfs_icsb_count(
 
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		cnt->icsb_ifree += cntp->icsb_ifree;
 		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
 	}
 
@@ -1665,7 +1667,7 @@ xfs_icsb_counter_disabled(
 	xfs_mount_t	*mp,
 	xfs_sb_field_t	field)
 {
-	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT(field == XFS_SBS_FDBLOCKS);
 	return test_bit(field, &mp->m_icsb_counters);
 }
 
@@ -1676,7 +1678,7 @@ xfs_icsb_disable_counter(
 {
 	xfs_icsb_cnts_t	cnt;
 
-	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT(field == XFS_SBS_FDBLOCKS);
 
 	/*
 	 * If we are already disabled, then there is nothing to do
@@ -1695,9 +1697,6 @@ xfs_icsb_disable_counter(
 
 		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
 		switch(field) {
-		case XFS_SBS_IFREE:
-			mp->m_sb.sb_ifree = cnt.icsb_ifree;
-			break;
 		case XFS_SBS_FDBLOCKS:
 			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
 			break;
@@ -1719,15 +1718,12 @@ xfs_icsb_enable_counter(
 	xfs_icsb_cnts_t	*cntp;
 	int		i;
 
-	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
+	ASSERT(field == XFS_SBS_FDBLOCKS);
 
 	xfs_icsb_lock_all_counters(mp);
 	for_each_online_cpu(i) {
 		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
 		switch (field) {
-		case XFS_SBS_IFREE:
-			cntp->icsb_ifree = count + resid;
-			break;
 		case XFS_SBS_FDBLOCKS:
 			cntp->icsb_fdblocks = count + resid;
 			break;
@@ -1750,8 +1746,6 @@ xfs_icsb_sync_counters_locked(
 
 	xfs_icsb_count(mp, &cnt, flags);
 
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_IFREE))
-		mp->m_sb.sb_ifree = cnt.icsb_ifree;
 	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
 		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
 }
@@ -1803,12 +1797,6 @@ xfs_icsb_balance_counter_locked(
 
 	/* update counters  - first CPU gets residual*/
 	switch (field) {
-	case XFS_SBS_IFREE:
-		count = mp->m_sb.sb_ifree;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
-			return;
-		break;
 	case XFS_SBS_FDBLOCKS:
 		count = mp->m_sb.sb_fdblocks;
 		resid = do_div(count, weight);
@@ -1863,14 +1851,6 @@ again:
 	}
 
 	switch (field) {
-	case XFS_SBS_IFREE:
-		lcounter = icsbp->icsb_ifree;
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_ifree = lcounter;
-		break;
-
 	case XFS_SBS_FDBLOCKS:
 		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
 
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 408e2fe..c17bfa4 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1088,6 +1088,7 @@ xfs_fs_statfs(
 	struct xfs_inode	*ip = XFS_I(dentry->d_inode);
 	__uint64_t		fakeinos, id;
 	__uint64_t		sb_icount;
+	__uint64_t		sb_ifree;
 	xfs_extlen_t		lsize;
 	__int64_t		ffree;
 
@@ -1100,6 +1101,7 @@ xfs_fs_statfs(
 
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	sb_icount = percpu_counter_sum(&sbp->sb_icount);
+	sb_ifree = percpu_counter_sum(&sbp->sb_ifree);
 
 	spin_lock(&mp->m_sb_lock);
 	statp->f_bsize = sbp->sb_blocksize;
@@ -1116,7 +1118,7 @@ xfs_fs_statfs(
 					mp->m_maxicount);
 
 	/* make sure statp->f_ffree does not underflow */
-	ffree = statp->f_files - (sb_icount - sbp->sb_ifree);
+	ffree = statp->f_files - (sb_icount - sb_ifree);
 	statp->f_ffree = max_t(__int64_t, ffree, 0);
 
 	spin_unlock(&mp->m_sb_lock);
diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
index fa5603c..6efc7a2 100644
--- a/fs/xfs/xfs_super.h
+++ b/fs/xfs/xfs_super.h
@@ -98,8 +98,8 @@ struct xfs_sb {
 					/* statistics */
 
 	struct percpu_counter	sb_icount;	/* allocated inodes */
+	struct percpu_counter	sb_ifree;	/* free inodes */
 
-	__uint64_t	sb_ifree;	/* free inodes */
 	__uint64_t	sb_fdblocks;	/* free data blocks */
 	__uint64_t	sb_frextents;	/* free realtime extents */
 	xfs_ino_t	sb_uquotino;	/* user quota inode */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index d78b0ae..c54d4b7 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -560,8 +560,7 @@ xfs_trans_unreserve_and_mod_sb(
 	}
 
 	if (ifreedelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_IFREE,
-						 ifreedelta, rsvd);
+		error = xfs_mod_incore_sb(mp, XFS_SBS_IFREE, ifreedelta, rsvd);
 		if (error)
 			goto out_undo_icount;
 	}
@@ -630,7 +629,7 @@ xfs_trans_unreserve_and_mod_sb(
 
 out_undo_ifreecount:
 	if (ifreedelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
+		xfs_mod_incore_sb(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
 out_undo_icount:
 	if (idelta)
 		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
-- 
2.0.0


* [PATCH 4/5] xfs: use generic percpu counters for free block counter
From: Dave Chinner @ 2015-02-01 21:43 UTC (permalink / raw)
  To: xfs

XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. The free block counter is
special in that it is used for ENOSPC detection outside transaction
contexts for delayed allocation. This means that the counter
needs to be accurate at zero. The current per-cpu counter code jumps
through lots of hoops to ensure we never run past zero, but we don't
need to make all those jumps with the generic counter
implementation.

The generic counter implementation allows us to pass a "batch"
threshold at which the addition/subtraction to the counter value
will be folded back into global value under lock. We can use this
feature to reduce the batch size as we approach 0 in a very similar
manner to the existing counters and their rebalance algorithm. If we
use a batch size of 1 as we approach 0, then every addition and
subtraction will be done against the global value and hence allow
accurate detection of zero threshold crossing.

Hence we can replace the handrolled, accurate-at-zero counters with
generic percpu counters.

Note: this removes just enough of the icsb infrastructure to compile
without warnings. The rest will go in subsequent commits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_sb.c |  11 +++-
 fs/xfs/xfs_fsops.c     |   9 +--
 fs/xfs/xfs_iomap.c     |   2 +-
 fs/xfs/xfs_mount.c     | 167 ++++++++++++++++++++++++-------------------------
 fs/xfs/xfs_super.c     |  11 +++-
 fs/xfs/xfs_super.h     |   2 +-
 fs/xfs/xfs_trans.c     |   9 ++-
 7 files changed, 109 insertions(+), 102 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index 42e5c89..bdde5c7 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -357,11 +357,15 @@ __xfs_sb_from_disk(
 	to->sb_rextslog = from->sb_rextslog;
 	to->sb_inprogress = from->sb_inprogress;
 	to->sb_imax_pct = from->sb_imax_pct;
+
 	if (percpu_counter_initialized(&to->sb_icount))
 		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
 	if (percpu_counter_initialized(&to->sb_ifree))
 		percpu_counter_set(&to->sb_ifree, be64_to_cpu(from->sb_ifree));
-	to->sb_fdblocks = be64_to_cpu(from->sb_fdblocks);
+	if (percpu_counter_initialized(&to->sb_fdblocks))
+		percpu_counter_set(&to->sb_fdblocks,
+				    be64_to_cpu(from->sb_fdblocks));
+
 	to->sb_frextents = be64_to_cpu(from->sb_frextents);
 	to->sb_uquotino = be64_to_cpu(from->sb_uquotino);
 	to->sb_gquotino = be64_to_cpu(from->sb_gquotino);
@@ -496,7 +500,7 @@ xfs_sb_to_disk(
 	to->sb_imax_pct = from->sb_imax_pct;
 	to->sb_icount = cpu_to_be64(percpu_counter_sum(&from->sb_icount));
 	to->sb_ifree = cpu_to_be64(percpu_counter_sum(&from->sb_ifree));
-	to->sb_fdblocks = cpu_to_be64(from->sb_fdblocks);
+	to->sb_fdblocks = cpu_to_be64(percpu_counter_sum(&from->sb_fdblocks));
 	to->sb_frextents = cpu_to_be64(from->sb_frextents);
 
 	to->sb_flags = from->sb_flags;
@@ -542,6 +546,7 @@ xfs_sb_verify(
 	/* don't need to validate icount here */
 	sb.sb_icount.counters = NULL;
 	sb.sb_ifree.counters = NULL;
+	sb.sb_fdblocks.counters = NULL;
 
 	/*
 	 * Use call variant which doesn't convert quota flags from disk 
@@ -755,7 +760,7 @@ xfs_initialize_perag_data(
 	spin_lock(&mp->m_sb_lock);
 	percpu_counter_set(&sbp->sb_ifree, ifree);
 	percpu_counter_set(&sbp->sb_icount, ialloc);
-	sbp->sb_fdblocks = bfree + bfreelst + btree;
+	percpu_counter_set(&sbp->sb_fdblocks, bfree + bfreelst + btree);
 	spin_unlock(&mp->m_sb_lock);
 
 	/* Fixup the per-cpu counters as well. */
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 619a9f3..ccb00cd 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -633,9 +633,10 @@ xfs_fs_counts(
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
 	cnt->freeino = percpu_counter_read_positive(&mp->m_sb.sb_ifree);
+	cnt->freedata = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks) -
+							XFS_ALLOC_SET_ASIDE(mp);
 
 	spin_lock(&mp->m_sb_lock);
-	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
 	cnt->freertx = mp->m_sb.sb_frextents;
 	spin_unlock(&mp->m_sb_lock);
 	return 0;
@@ -710,7 +711,8 @@ retry:
 	} else {
 		__int64_t	free;
 
-		free =  mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+		free = percpu_counter_sum(&mp->m_sb.sb_fdblocks) -
+							XFS_ALLOC_SET_ASIDE(mp);
 		if (!free)
 			goto out; /* ENOSPC and fdblks_delta = 0 */
 
@@ -749,8 +751,7 @@ out:
 		 * the extra reserve blocks from the reserve.....
 		 */
 		int error;
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
-						 fdblks_delta, 0);
+		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, fdblks_delta, 0);
 		if (error == -ENOSPC)
 			goto retry;
 	}
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ccb1dd0..310433a 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -461,7 +461,7 @@ xfs_iomap_prealloc_size(
 				       alloc_blocks);
 
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
-	freesp = mp->m_sb.sb_fdblocks;
+	freesp = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks);
 	if (freesp < mp->m_low_space[XFS_LOWSP_5_PCNT]) {
 		shift = 2;
 		if (freesp < mp->m_low_space[XFS_LOWSP_4_PCNT])
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 8e8924f..0e37248 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1117,7 +1117,8 @@ xfs_mod_incore_sb_unlocked(
 {
 	int		scounter;	/* short counter for 32 bit fields */
 	long long	lcounter;	/* long counter for 64 bit fields */
-	long long	res_used, rem;
+	long long	res_used;
+	s32		batch;
 
 	/*
 	 * With the in-core superblock spin lock held, switch
@@ -1144,47 +1145,80 @@ xfs_mod_incore_sb_unlocked(
 		}
 		return 0;
 	case XFS_SBS_FDBLOCKS:
-		lcounter = (long long)
-			mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
 
 		if (delta > 0) {		/* Putting blocks back */
+			if (mp->m_resblks == mp->m_resblks_avail) {
+				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
+				return 0;
+			}
+
+			/* put blocks back into reserve pool first */
+			spin_lock(&mp->m_sb_lock);
+			res_used = (long long)
+					(mp->m_resblks - mp->m_resblks_avail);
+
 			if (res_used > delta) {
 				mp->m_resblks_avail += delta;
 			} else {
-				rem = delta - res_used;
+				delta -= res_used;
 				mp->m_resblks_avail = mp->m_resblks;
-				lcounter += rem;
-			}
-		} else {				/* Taking blocks away */
-			lcounter += delta;
-			if (lcounter >= 0) {
-				mp->m_sb.sb_fdblocks = lcounter +
-							XFS_ALLOC_SET_ASIDE(mp);
-				return 0;
+				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
 			}
+			spin_unlock(&mp->m_sb_lock);
+			return 0;
 
-			/*
-			 * We are out of blocks, use any available reserved
-			 * blocks if were allowed to.
-			 */
-			if (!rsvd)
-				return -ENOSPC;
+		}
 
-			lcounter = (long long)mp->m_resblks_avail + delta;
-			if (lcounter >= 0) {
-				mp->m_resblks_avail = lcounter;
-				return 0;
-			}
-			printk_once(KERN_WARNING
-				"Filesystem \"%s\": reserve blocks depleted! "
-				"Consider increasing reserve pool size.",
-				mp->m_fsname);
-			return -ENOSPC;
+		/*
+		 * Taking blocks away, need to be more accurate the closer we
+		 * are to zero.
+		 *
+		 * batch size is set to a maximum of 1024 blocks - if we are
+		 * allocating or freeing extents larger than this then we aren't
+		 * going to be hammering the counter lock so a lock per update
+		 * is not a problem.
+		 *
+		 * If the counter has a value of less than 2 * max batch size,
+		 * then make everything serialise as we are real close to
+		 * ENOSPC.
+		 */
+#define __BATCH	1024
+		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
+					   2 * __BATCH) < 0)
+			batch = 1;
+		else
+			batch = __BATCH;
+
+		__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
+		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
+					   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
+			/* we had space! */
+			return 0;
 		}
 
-		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		return 0;
+		/*
+		 * lock up the sb for dipping into reserves before releasing
+		 * the space that took us to ENOSPC.
+		 */
+		spin_lock(&mp->m_sb_lock);
+		percpu_counter_add(&mp->m_sb.sb_fdblocks, -delta);
+		if (!rsvd)
+			goto fdblocks_enospc;
+
+		lcounter = (long long)mp->m_resblks_avail + delta;
+		if (lcounter >= 0) {
+			mp->m_resblks_avail = lcounter;
+			spin_unlock(&mp->m_sb_lock);
+			return 0;
+		}
+		printk_once(KERN_WARNING
+			"Filesystem \"%s\": reserve blocks depleted! "
+			"Consider increasing reserve pool size.",
+			mp->m_fsname);
+fdblocks_enospc:
+		spin_unlock(&mp->m_sb_lock);
+		return -ENOSPC;
+
 	case XFS_SBS_FREXTENTS:
 		lcounter = (long long)mp->m_sb.sb_frextents;
 		lcounter += delta;
@@ -1286,11 +1320,14 @@ xfs_mod_incore_sb(
 {
 	int			status;
 
-#ifdef HAVE_PERCPU_SB
-	ASSERT(field != XFS_SBS_FDBLOCKS);
-#endif
-	if (field == XFS_SBS_ICOUNT || field == XFS_SBS_IFREE)
+	switch (field) {
+	case XFS_SBS_ICOUNT:
+	case XFS_SBS_IFREE:
+	case XFS_SBS_FDBLOCKS:
 		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
+	default:
+		break;
+	}
 
 	spin_lock(&mp->m_sb_lock);
 	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
@@ -1309,7 +1346,7 @@ xfs_mod_incore_sb(
  *
  * Note that this function may not be used for the superblock values that
  * are tracked with the in-memory per-cpu counters - a direct call to
- * xfs_icsb_modify_counters is required for these.
+ * xfs_mod_incore_sb is required for these.
  */
 int
 xfs_mod_incore_sb_batch(
@@ -1494,7 +1531,6 @@ xfs_icsb_cpu_notify(
 	case CPU_ONLINE:
 	case CPU_ONLINE_FROZEN:
 		xfs_icsb_lock(mp);
-		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 		xfs_icsb_unlock(mp);
 		break;
 	case CPU_DEAD:
@@ -1504,13 +1540,9 @@ xfs_icsb_cpu_notify(
 		 * re-enable the counters. */
 		xfs_icsb_lock(mp);
 		spin_lock(&mp->m_sb_lock);
-		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
-
-		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
 
 		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
 
-		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
 		spin_unlock(&mp->m_sb_lock);
 		xfs_icsb_unlock(mp);
 		break;
@@ -1535,9 +1567,13 @@ xfs_icsb_init_counters(
 	if (i)
 		goto free_icount;
 
+	i = percpu_counter_init(&mp->m_sb.sb_fdblocks, 0, GFP_KERNEL);
+	if (i)
+		goto free_ifree;
+
 	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
 	if (!mp->m_sb_cnts)
-		goto free_ifree;
+		goto free_fdblocks;
 
 	for_each_online_cpu(i) {
 		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
@@ -1560,6 +1596,8 @@ xfs_icsb_init_counters(
 
 	return 0;
 
+free_fdblocks:
+	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
 free_ifree:
 	percpu_counter_destroy(&mp->m_sb.sb_ifree);
 free_icount:
@@ -1577,7 +1615,6 @@ xfs_icsb_reinit_counters(
 	 * initial balance kicks us off correctly
 	 */
 	mp->m_icsb_counters = -1;
-	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
 	xfs_icsb_unlock(mp);
 }
 
@@ -1592,6 +1629,7 @@ xfs_icsb_destroy_counters(
 
 	percpu_counter_destroy(&mp->m_sb.sb_icount);
 	percpu_counter_destroy(&mp->m_sb.sb_ifree);
+	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
 
 	mutex_destroy(&mp->m_icsb_mutex);
 }
@@ -1645,18 +1683,11 @@ xfs_icsb_count(
 	xfs_icsb_cnts_t	*cnt,
 	int		flags)
 {
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
 	memset(cnt, 0, sizeof(xfs_icsb_cnts_t));
 
 	if (!(flags & XFS_ICSB_LAZY_COUNT))
 		xfs_icsb_lock_all_counters(mp);
 
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
-	}
 
 	if (!(flags & XFS_ICSB_LAZY_COUNT))
 		xfs_icsb_unlock_all_counters(mp);
@@ -1667,7 +1698,6 @@ xfs_icsb_counter_disabled(
 	xfs_mount_t	*mp,
 	xfs_sb_field_t	field)
 {
-	ASSERT(field == XFS_SBS_FDBLOCKS);
 	return test_bit(field, &mp->m_icsb_counters);
 }
 
@@ -1678,8 +1708,6 @@ xfs_icsb_disable_counter(
 {
 	xfs_icsb_cnts_t	cnt;
 
-	ASSERT(field == XFS_SBS_FDBLOCKS);
-
 	/*
 	 * If we are already disabled, then there is nothing to do
 	 * here. We check before locking all the counters to avoid
@@ -1697,9 +1725,6 @@ xfs_icsb_disable_counter(
 
 		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
 		switch(field) {
-		case XFS_SBS_FDBLOCKS:
-			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
-			break;
 		default:
 			BUG();
 		}
@@ -1715,18 +1740,11 @@ xfs_icsb_enable_counter(
 	uint64_t	count,
 	uint64_t	resid)
 {
-	xfs_icsb_cnts_t	*cntp;
 	int		i;
 
-	ASSERT(field == XFS_SBS_FDBLOCKS);
-
 	xfs_icsb_lock_all_counters(mp);
 	for_each_online_cpu(i) {
-		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
 		switch (field) {
-		case XFS_SBS_FDBLOCKS:
-			cntp->icsb_fdblocks = count + resid;
-			break;
 		default:
 			BUG();
 			break;
@@ -1745,9 +1763,6 @@ xfs_icsb_sync_counters_locked(
 	xfs_icsb_cnts_t	cnt;
 
 	xfs_icsb_count(mp, &cnt, flags);
-
-	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
-		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
 }
 
 /*
@@ -1789,20 +1804,12 @@ xfs_icsb_balance_counter_locked(
 	int		min_per_cpu)
 {
 	uint64_t	count, resid;
-	int		weight = num_online_cpus();
-	uint64_t	min = (uint64_t)min_per_cpu;
 
 	/* disable counter and sync counter */
 	xfs_icsb_disable_counter(mp, field);
 
 	/* update counters  - first CPU gets residual*/
 	switch (field) {
-	case XFS_SBS_FDBLOCKS:
-		count = mp->m_sb.sb_fdblocks;
-		resid = do_div(count, weight);
-		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp)))
-			return;
-		break;
 	default:
 		BUG();
 		count = resid = 0;	/* quiet, gcc */
@@ -1831,7 +1838,6 @@ xfs_icsb_modify_counters(
 	int		rsvd)
 {
 	xfs_icsb_cnts_t	*icsbp;
-	long long	lcounter;	/* long counter for 64 bit fields */
 	int		ret = 0;
 
 	might_sleep();
@@ -1851,18 +1857,9 @@ again:
 	}
 
 	switch (field) {
-	case XFS_SBS_FDBLOCKS:
-		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
-
-		lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
-		lcounter += delta;
-		if (unlikely(lcounter < 0))
-			goto balance_counter;
-		icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
-		break;
 	default:
 		BUG();
-		break;
+		goto balance_counter; /* be still, gcc */
 	}
 	xfs_icsb_unlock_cntr(icsbp);
 	preempt_enable();
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index c17bfa4..0fa688a 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1089,6 +1089,7 @@ xfs_fs_statfs(
 	__uint64_t		fakeinos, id;
 	__uint64_t		sb_icount;
 	__uint64_t		sb_ifree;
+	__uint64_t		sb_fdblocks;
 	xfs_extlen_t		lsize;
 	__int64_t		ffree;
 
@@ -1100,15 +1101,20 @@ xfs_fs_statfs(
 	statp->f_fsid.val[1] = (u32)(id >> 32);
 
 	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
+
 	sb_icount = percpu_counter_sum(&sbp->sb_icount);
 	sb_ifree = percpu_counter_sum(&sbp->sb_ifree);
+	sb_fdblocks = percpu_counter_sum(&sbp->sb_fdblocks);
 
 	spin_lock(&mp->m_sb_lock);
 	statp->f_bsize = sbp->sb_blocksize;
 	lsize = sbp->sb_logstart ? sbp->sb_logblocks : 0;
 	statp->f_blocks = sbp->sb_dblocks - lsize;
-	statp->f_bfree = statp->f_bavail =
-				sbp->sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+	spin_unlock(&mp->m_sb_lock);
+
+	statp->f_bfree = sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
+	statp->f_bavail = statp->f_bfree;
+
 	fakeinos = statp->f_bfree << sbp->sb_inopblog;
 	statp->f_files = MIN(sb_icount + fakeinos,
 			     (__uint64_t)XFS_MAXINUMBER);
@@ -1121,7 +1127,6 @@ xfs_fs_statfs(
 	ffree = statp->f_files - (sb_icount - sb_ifree);
 	statp->f_ffree = max_t(__int64_t, ffree, 0);
 
-	spin_unlock(&mp->m_sb_lock);
 
 	if ((ip->i_d.di_flags & XFS_DIFLAG_PROJINHERIT) &&
 	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
index 6efc7a2..f649d1a 100644
--- a/fs/xfs/xfs_super.h
+++ b/fs/xfs/xfs_super.h
@@ -99,8 +99,8 @@ struct xfs_sb {
 
 	struct percpu_counter	sb_icount;	/* allocated inodes */
 	struct percpu_counter	sb_ifree;	/* free inodes */
+	struct percpu_counter	sb_fdblocks;	/* free data blocks */
 
-	__uint64_t	sb_fdblocks;	/* free data blocks */
 	__uint64_t	sb_frextents;	/* free realtime extents */
 	xfs_ino_t	sb_uquotino;	/* user quota inode */
 	xfs_ino_t	sb_gquotino;	/* group quota inode */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index c54d4b7..b7da423 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -184,7 +184,7 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (blocks > 0) {
-		error = xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
+		error = xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
 					  -((int64_t)blocks), rsvd);
 		if (error != 0) {
 			current_restore_flags_nested(&tp->t_pflags, PF_FSTRANS);
@@ -268,7 +268,7 @@ undo_log:
 
 undo_blocks:
 	if (blocks > 0) {
-		xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
+		xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
 					 (int64_t)blocks, rsvd);
 		tp->t_blk_res = 0;
 	}
@@ -547,8 +547,7 @@ xfs_trans_unreserve_and_mod_sb(
 
 	/* apply the per-cpu counters */
 	if (blkdelta) {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
-						 blkdelta, rsvd);
+		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, blkdelta, rsvd);
 		if (error)
 			goto out;
 	}
@@ -635,7 +634,7 @@ out_undo_icount:
 		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
 out_undo_fdblocks:
 	if (blkdelta)
-		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
+		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
 out:
 	ASSERT(error == 0);
 	return;
-- 
2.0.0


* [PATCH 5/5] xfs: Remove icsb infrastructure
  2015-02-01 21:42 [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Dave Chinner
                   ` (3 preceding siblings ...)
  2015-02-01 21:43 ` [PATCH 4/5] xfs: use generic percpu counters for free block counter Dave Chinner
@ 2015-02-01 21:43 ` Dave Chinner
  2015-02-02 17:11   ` Brian Foster
  2015-02-03 21:50 ` [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Christoph Hellwig
  5 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2015-02-01 21:43 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

Now that the in-core superblock infrastructure has been replaced with
generic per-cpu counters, we don't need it anymore. Nuke it from
orbit so we are sure that it won't haunt us again...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/libxfs/xfs_bmap.c |  16 +-
 fs/xfs/libxfs/xfs_sb.c   |  10 +-
 fs/xfs/xfs_fsops.c       |   2 -
 fs/xfs/xfs_iomap.c       |   1 -
 fs/xfs/xfs_linux.h       |   9 -
 fs/xfs/xfs_log_recover.c |   3 -
 fs/xfs/xfs_mount.c       | 509 -----------------------------------------------
 fs/xfs/xfs_mount.h       |  64 +-----
 fs/xfs/xfs_super.c       |  76 +++++--
 9 files changed, 67 insertions(+), 623 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 61ec015..ac4d64e 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -2212,7 +2212,7 @@ xfs_bmap_add_extent_delay_real(
 		diff = (int)(temp + temp2 - startblockval(PREV.br_startblock) -
 			(bma->cur ? bma->cur->bc_private.b.allocated : 0));
 		if (diff > 0) {
-			error = xfs_icsb_modify_counters(bma->ip->i_mount,
+			error = xfs_mod_incore_sb(bma->ip->i_mount,
 					XFS_SBS_FDBLOCKS,
 					-((int64_t)diff), 0);
 			ASSERT(!error);
@@ -2265,7 +2265,7 @@ xfs_bmap_add_extent_delay_real(
 			temp += bma->cur->bc_private.b.allocated;
 		ASSERT(temp <= da_old);
 		if (temp < da_old)
-			xfs_icsb_modify_counters(bma->ip->i_mount,
+			xfs_mod_incore_sb(bma->ip->i_mount,
 					XFS_SBS_FDBLOCKS,
 					(int64_t)(da_old - temp), 0);
 	}
@@ -2944,7 +2944,7 @@ xfs_bmap_add_extent_hole_delay(
 	}
 	if (oldlen != newlen) {
 		ASSERT(oldlen > newlen);
-		xfs_icsb_modify_counters(ip->i_mount, XFS_SBS_FDBLOCKS,
+		xfs_mod_incore_sb(ip->i_mount, XFS_SBS_FDBLOCKS,
 			(int64_t)(oldlen - newlen), 0);
 		/*
 		 * Nothing to do for disk quota accounting here.
@@ -4163,14 +4163,14 @@ xfs_bmapi_reserve_delalloc(
 		error = xfs_mod_incore_sb(mp, XFS_SBS_FREXTENTS,
 					  -((int64_t)extsz), 0);
 	} else {
-		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
+		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
 						 -((int64_t)alen), 0);
 	}
 
 	if (error)
 		goto out_unreserve_quota;
 
-	error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
+	error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
 					 -((int64_t)indlen), 0);
 	if (error)
 		goto out_unreserve_blocks;
@@ -4200,7 +4200,7 @@ out_unreserve_blocks:
 	if (rt)
 		xfs_mod_incore_sb(mp, XFS_SBS_FREXTENTS, extsz, 0);
 	else
-		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS, alen, 0);
+		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, alen, 0);
 out_unreserve_quota:
 	if (XFS_IS_QUOTA_ON(mp))
 		xfs_trans_unreserve_quota_nblks(NULL, ip, (long)alen, 0, rt ?
@@ -5013,7 +5013,7 @@ xfs_bmap_del_extent(
 	 */
 	ASSERT(da_old >= da_new);
 	if (da_old > da_new) {
-		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
+		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
 			(int64_t)(da_old - da_new), 0);
 	}
 done:
@@ -5290,7 +5290,7 @@ xfs_bunmapi(
 					ip, -((long)del.br_blockcount), 0,
 					XFS_QMOPT_RES_RTBLKS);
 			} else {
-				xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
+				xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
 						(int64_t)del.br_blockcount, 0);
 				(void)xfs_trans_reserve_quota_nblks(NULL,
 					ip, -((long)del.br_blockcount), 0,
diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
index bdde5c7..676f2a1 100644
--- a/fs/xfs/libxfs/xfs_sb.c
+++ b/fs/xfs/libxfs/xfs_sb.c
@@ -754,17 +754,11 @@ xfs_initialize_perag_data(
 		btree += pag->pagf_btreeblks;
 		xfs_perag_put(pag);
 	}
-	/*
-	 * Overwrite incore superblock counters with just-read data
-	 */
-	spin_lock(&mp->m_sb_lock);
+
+	/* Overwrite incore superblock counters with just-read data */
 	percpu_counter_set(&sbp->sb_ifree, ifree);
 	percpu_counter_set(&sbp->sb_icount, ialloc);
 	percpu_counter_set(&sbp->sb_fdblocks, bfree + bfreelst + btree);
-	spin_unlock(&mp->m_sb_lock);
-
-	/* Fixup the per-cpu counters as well. */
-	xfs_icsb_reinit_counters(mp);
 
 	return 0;
 }
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index ccb00cd..28389e0 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -630,7 +630,6 @@ xfs_fs_counts(
 	xfs_mount_t		*mp,
 	xfs_fsop_counts_t	*cnt)
 {
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
 	cnt->freeino = percpu_counter_read_positive(&mp->m_sb.sb_ifree);
 	cnt->freedata = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks) -
@@ -694,7 +693,6 @@ xfs_reserve_blocks(
 	 */
 retry:
 	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, 0);
 
 	/*
 	 * If our previous reservation was larger than the current value,
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 310433a..67a41f2 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -460,7 +460,6 @@ xfs_iomap_prealloc_size(
 	alloc_blocks = XFS_FILEOFF_MIN(roundup_pow_of_two(MAXEXTLEN),
 				       alloc_blocks);
 
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
 	freesp = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks);
 	if (freesp < mp->m_low_space[XFS_LOWSP_5_PCNT]) {
 		shift = 2;
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index c31d2c2..7c7842c 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -116,15 +116,6 @@ typedef __uint64_t __psunsigned_t;
 #undef XFS_NATIVE_HOST
 #endif
 
-/*
- * Feature macros (disable/enable)
- */
-#ifdef CONFIG_SMP
-#define HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#else
-#undef  HAVE_PERCPU_SB	/* per cpu superblock counters are a 2.6 feature */
-#endif
-
 #define irix_sgid_inherit	xfs_params.sgid_inherit.val
 #define irix_symlink_mode	xfs_params.symlink_mode.val
 #define xfs_panic_mask		xfs_params.panic_mask.val
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 81d7f24..80d2146 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -4465,9 +4465,6 @@ xlog_do_recover(
 	ASSERT(xfs_sb_good_version(sbp));
 	xfs_buf_relse(bp);
 
-	/* We've re-read the superblock so re-initialize per-cpu counters */
-	xfs_icsb_reinit_counters(log->l_mp);
-
 	xlog_recover_check_summary(log);
 
 	/* Normal transactions can now occur */
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 0e37248..07498f0 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -43,18 +43,6 @@
 #include "xfs_sysfs.h"
 
 
-#ifdef HAVE_PERCPU_SB
-STATIC void	xfs_icsb_balance_counter(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_balance_counter_locked(xfs_mount_t *, xfs_sb_field_t,
-						int);
-STATIC void	xfs_icsb_disable_counter(xfs_mount_t *, xfs_sb_field_t);
-#else
-
-#define xfs_icsb_balance_counter(mp, a, b)		do { } while (0)
-#define xfs_icsb_balance_counter_locked(mp, a, b)	do { } while (0)
-#endif
-
 static DEFINE_MUTEX(xfs_uuid_table_mutex);
 static int xfs_uuid_table_size;
 static uuid_t *xfs_uuid_table;
@@ -347,9 +335,6 @@ reread:
 		goto reread;
 	}
 
-	/* Initialize per-cpu counters */
-	xfs_icsb_reinit_counters(mp);
-
 	/* no need to be quiet anymore, so reset the buf ops */
 	bp->b_ops = &xfs_sb_buf_ops;
 
@@ -1087,8 +1072,6 @@ xfs_log_sbcount(xfs_mount_t *mp)
 	if (!xfs_fs_writable(mp, SB_FREEZE_COMPLETE))
 		return 0;
 
-	xfs_icsb_sync_counters(mp, 0);
-
 	/*
 	 * we don't need to do this if we are updating the superblock
 	 * counters on every modification.
@@ -1446,495 +1429,3 @@ xfs_dev_is_read_only(
 	}
 	return 0;
 }
-
-#ifdef HAVE_PERCPU_SB
-/*
- * Per-cpu incore superblock counters
- *
- * Simple concept, difficult implementation
- *
- * Basically, replace the incore superblock counters with a distributed per cpu
- * counter for contended fields (e.g.  free block count).
- *
- * Difficulties arise in that the incore sb is used for ENOSPC checking, and
- * hence needs to be accurately read when we are running low on space. Hence
- * there is a method to enable and disable the per-cpu counters based on how
- * much "stuff" is available in them.
- *
- * Basically, a counter is enabled if there is enough free resource to justify
- * running a per-cpu fast-path. If the per-cpu counter runs out (i.e. a local
- * ENOSPC), then we disable the counters to synchronise all callers and
- * re-distribute the available resources.
- *
- * If, once we redistributed the available resources, we still get a failure,
- * we disable the per-cpu counter and go through the slow path.
- *
- * The slow path is the current xfs_mod_incore_sb() function.  This means that
- * when we disable a per-cpu counter, we need to drain its resources back to
- * the global superblock. We do this after disabling the counter to prevent
- * more threads from queueing up on the counter.
- *
- * Essentially, this means that we still need a lock in the fast path to enable
- * synchronisation between the global counters and the per-cpu counters. This
- * is not a problem because the lock will be local to a CPU almost all the time
- * and have little contention except when we get to ENOSPC conditions.
- *
- * Basically, this lock becomes a barrier that enables us to lock out the fast
- * path while we do things like enabling and disabling counters and
- * synchronising the counters.
- *
- * Locking rules:
- *
- * 	1. m_sb_lock before picking up per-cpu locks
- * 	2. per-cpu locks always picked up via for_each_online_cpu() order
- * 	3. accurate counter sync requires m_sb_lock + per cpu locks
- * 	4. modifying per-cpu counters requires holding per-cpu lock
- * 	5. modifying global counters requires holding m_sb_lock
- *	6. enabling or disabling a counter requires holding the m_sb_lock 
- *	   and _none_ of the per-cpu locks.
- *
- * Disabled counters are only ever re-enabled by a balance operation
- * that results in more free resources per CPU than a given threshold.
- * To ensure counters don't remain disabled, they are rebalanced when
- * the global resource goes above a higher threshold (i.e. some hysteresis
- * is present to prevent thrashing).
- */
-
-#ifdef CONFIG_HOTPLUG_CPU
-/*
- * hot-plug CPU notifier support.
- *
- * We need a notifier per filesystem as we need to be able to identify
- * the filesystem to balance the counters out. This is achieved by
- * having a notifier block embedded in the xfs_mount_t and doing pointer
- * magic to get the mount pointer from the notifier block address.
- */
-STATIC int
-xfs_icsb_cpu_notify(
-	struct notifier_block *nfb,
-	unsigned long action,
-	void *hcpu)
-{
-	xfs_icsb_cnts_t *cntp;
-	xfs_mount_t	*mp;
-
-	mp = (xfs_mount_t *)container_of(nfb, xfs_mount_t, m_icsb_notifier);
-	cntp = (xfs_icsb_cnts_t *)
-			per_cpu_ptr(mp->m_sb_cnts, (unsigned long)hcpu);
-	switch (action) {
-	case CPU_UP_PREPARE:
-	case CPU_UP_PREPARE_FROZEN:
-		/* Easy Case - initialize the area and locks, and
-		 * then rebalance when online does everything else for us. */
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-		break;
-	case CPU_ONLINE:
-	case CPU_ONLINE_FROZEN:
-		xfs_icsb_lock(mp);
-		xfs_icsb_unlock(mp);
-		break;
-	case CPU_DEAD:
-	case CPU_DEAD_FROZEN:
-		/* Disable all the counters, then fold the dead cpu's
-		 * count into the total on the global superblock and
-		 * re-enable the counters. */
-		xfs_icsb_lock(mp);
-		spin_lock(&mp->m_sb_lock);
-
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-
-		spin_unlock(&mp->m_sb_lock);
-		xfs_icsb_unlock(mp);
-		break;
-	}
-
-	return NOTIFY_OK;
-}
-#endif /* CONFIG_HOTPLUG_CPU */
-
-int
-xfs_icsb_init_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	i = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
-	if (i)
-		return ENOMEM;
-
-	i = percpu_counter_init(&mp->m_sb.sb_ifree, 0, GFP_KERNEL);
-	if (i)
-		goto free_icount;
-
-	i = percpu_counter_init(&mp->m_sb.sb_fdblocks, 0, GFP_KERNEL);
-	if (i)
-		goto free_ifree;
-
-	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
-	if (!mp->m_sb_cnts)
-		goto free_fdblocks;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
-	}
-
-	mutex_init(&mp->m_icsb_mutex);
-
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-
-#ifdef CONFIG_HOTPLUG_CPU
-	mp->m_icsb_notifier.notifier_call = xfs_icsb_cpu_notify;
-	mp->m_icsb_notifier.priority = 0;
-	register_hotcpu_notifier(&mp->m_icsb_notifier);
-#endif /* CONFIG_HOTPLUG_CPU */
-
-	return 0;
-
-free_fdblocks:
-	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
-free_ifree:
-	percpu_counter_destroy(&mp->m_sb.sb_ifree);
-free_icount:
-	percpu_counter_destroy(&mp->m_sb.sb_icount);
-	return -ENOMEM;
-}
-
-void
-xfs_icsb_reinit_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_lock(mp);
-	/*
-	 * start with all counters disabled so that the
-	 * initial balance kicks us off correctly
-	 */
-	mp->m_icsb_counters = -1;
-	xfs_icsb_unlock(mp);
-}
-
-void
-xfs_icsb_destroy_counters(
-	xfs_mount_t	*mp)
-{
-	if (mp->m_sb_cnts) {
-		unregister_hotcpu_notifier(&mp->m_icsb_notifier);
-		free_percpu(mp->m_sb_cnts);
-	}
-
-	percpu_counter_destroy(&mp->m_sb.sb_icount);
-	percpu_counter_destroy(&mp->m_sb.sb_ifree);
-	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
-
-	mutex_destroy(&mp->m_icsb_mutex);
-}
-
-STATIC void
-xfs_icsb_lock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	while (test_and_set_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags)) {
-		ndelay(1000);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_cntr(
-	xfs_icsb_cnts_t	*icsbp)
-{
-	clear_bit(XFS_ICSB_FLAG_LOCK, &icsbp->icsb_flags);
-}
-
-
-STATIC void
-xfs_icsb_lock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_lock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_unlock_all_counters(
-	xfs_mount_t	*mp)
-{
-	xfs_icsb_cnts_t *cntp;
-	int		i;
-
-	for_each_online_cpu(i) {
-		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
-		xfs_icsb_unlock_cntr(cntp);
-	}
-}
-
-STATIC void
-xfs_icsb_count(
-	xfs_mount_t	*mp,
-	xfs_icsb_cnts_t	*cnt,
-	int		flags)
-{
-	memset(cnt, 0, sizeof(xfs_icsb_cnts_t));
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_lock_all_counters(mp);
-
-
-	if (!(flags & XFS_ICSB_LAZY_COUNT))
-		xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC int
-xfs_icsb_counter_disabled(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	return test_bit(field, &mp->m_icsb_counters);
-}
-
-STATIC void
-xfs_icsb_disable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	/*
-	 * If we are already disabled, then there is nothing to do
-	 * here. We check before locking all the counters to avoid
-	 * the expensive lock operation when being called in the
-	 * slow path and the counter is already disabled. This is
-	 * safe because the only time we set or clear this state is under
-	 * the m_icsb_mutex.
-	 */
-	if (xfs_icsb_counter_disabled(mp, field))
-		return;
-
-	xfs_icsb_lock_all_counters(mp);
-	if (!test_and_set_bit(field, &mp->m_icsb_counters)) {
-		/* drain back to superblock */
-
-		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
-		switch(field) {
-		default:
-			BUG();
-		}
-	}
-
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-STATIC void
-xfs_icsb_enable_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	uint64_t	count,
-	uint64_t	resid)
-{
-	int		i;
-
-	xfs_icsb_lock_all_counters(mp);
-	for_each_online_cpu(i) {
-		switch (field) {
-		default:
-			BUG();
-			break;
-		}
-		resid = 0;
-	}
-	clear_bit(field, &mp->m_icsb_counters);
-	xfs_icsb_unlock_all_counters(mp);
-}
-
-void
-xfs_icsb_sync_counters_locked(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	xfs_icsb_cnts_t	cnt;
-
-	xfs_icsb_count(mp, &cnt, flags);
-}
-
-/*
- * Accurate update of per-cpu counters to incore superblock
- */
-void
-xfs_icsb_sync_counters(
-	xfs_mount_t	*mp,
-	int		flags)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_sync_counters_locked(mp, flags);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-/*
- * Balance and enable/disable counters as necessary.
- *
- * Thresholds for re-enabling counters are somewhat magic.  inode counts are
- * chosen to be the same number as single on disk allocation chunk per CPU, and
- * free blocks is something far enough zero that we aren't going thrash when we
- * get near ENOSPC. We also need to supply a minimum we require per cpu to
- * prevent looping endlessly when xfs_alloc_space asks for more than will
- * be distributed to a single CPU but each CPU has enough blocks to be
- * reenabled.
- *
- * Note that we can be called when counters are already disabled.
- * xfs_icsb_disable_counter() optimises the counter locking in this case to
- * prevent locking every per-cpu counter needlessly.
- */
-
-#define XFS_ICSB_INO_CNTR_REENABLE	(uint64_t)64
-#define XFS_ICSB_FDBLK_CNTR_REENABLE(mp) \
-		(uint64_t)(512 + XFS_ALLOC_SET_ASIDE(mp))
-STATIC void
-xfs_icsb_balance_counter_locked(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  field,
-	int		min_per_cpu)
-{
-	uint64_t	count, resid;
-
-	/* disable counter and sync counter */
-	xfs_icsb_disable_counter(mp, field);
-
-	/* update counters  - first CPU gets residual*/
-	switch (field) {
-	default:
-		BUG();
-		count = resid = 0;	/* quiet, gcc */
-		break;
-	}
-
-	xfs_icsb_enable_counter(mp, field, count, resid);
-}
-
-STATIC void
-xfs_icsb_balance_counter(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t  fields,
-	int		min_per_cpu)
-{
-	spin_lock(&mp->m_sb_lock);
-	xfs_icsb_balance_counter_locked(mp, fields, min_per_cpu);
-	spin_unlock(&mp->m_sb_lock);
-}
-
-int
-xfs_icsb_modify_counters(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	int64_t		delta,
-	int		rsvd)
-{
-	xfs_icsb_cnts_t	*icsbp;
-	int		ret = 0;
-
-	might_sleep();
-again:
-	preempt_disable();
-	icsbp = this_cpu_ptr(mp->m_sb_cnts);
-
-	/*
-	 * if the counter is disabled, go to slow path
-	 */
-	if (unlikely(xfs_icsb_counter_disabled(mp, field)))
-		goto slow_path;
-	xfs_icsb_lock_cntr(icsbp);
-	if (unlikely(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock_cntr(icsbp);
-		goto slow_path;
-	}
-
-	switch (field) {
-	default:
-		BUG();
-		goto balance_counter; /* be still, gcc */
-	}
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-	return 0;
-
-slow_path:
-	preempt_enable();
-
-	/*
-	 * serialise with a mutex so we don't burn lots of cpu on
-	 * the superblock lock. We still need to hold the superblock
-	 * lock, however, when we modify the global structures.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * Now running atomically.
-	 *
-	 * If the counter is enabled, someone has beaten us to rebalancing.
-	 * Drop the lock and try again in the fast path....
-	 */
-	if (!(xfs_icsb_counter_disabled(mp, field))) {
-		xfs_icsb_unlock(mp);
-		goto again;
-	}
-
-	/*
-	 * The counter is currently disabled. Because we are
-	 * running atomically here, we know a rebalance cannot
-	 * be in progress. Hence we can go straight to operating
-	 * on the global superblock. We do not call xfs_mod_incore_sb()
-	 * here even though we need to get the m_sb_lock. Doing so
-	 * will cause us to re-enter this function and deadlock.
-	 * Hence we get the m_sb_lock ourselves and then call
-	 * xfs_mod_incore_sb_unlocked() as the unlocked path operates
-	 * directly on the global counters.
-	 */
-	spin_lock(&mp->m_sb_lock);
-	ret = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
-	spin_unlock(&mp->m_sb_lock);
-
-	/*
-	 * Now that we've modified the global superblock, we
-	 * may be able to re-enable the distributed counters
-	 * (e.g. lots of space just got freed). After that
-	 * we are done.
-	 */
-	if (ret != -ENOSPC)
-		xfs_icsb_balance_counter(mp, field, 0);
-	xfs_icsb_unlock(mp);
-	return ret;
-
-balance_counter:
-	xfs_icsb_unlock_cntr(icsbp);
-	preempt_enable();
-
-	/*
-	 * We may have multiple threads here if multiple per-cpu
-	 * counters run dry at the same time. This will mean we can
-	 * do more balances than strictly necessary but it is not
-	 * the common slowpath case.
-	 */
-	xfs_icsb_lock(mp);
-
-	/*
-	 * running atomically.
-	 *
-	 * This will leave the counter in the correct state for future
-	 * accesses. After the rebalance, we simply try again and our retry
-	 * will either succeed through the fast path or slow path without
-	 * another balance operation being required.
-	 */
-	xfs_icsb_balance_counter(mp, field, delta);
-	xfs_icsb_unlock(mp);
-	goto again;
-}
-
-#endif
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 9499a8f..4e22e96 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -29,42 +29,8 @@ struct xfs_quotainfo;
 struct xfs_dir_ops;
 struct xfs_da_geometry;
 
-#ifdef HAVE_PERCPU_SB
-
-/*
- * Valid per-cpu incore superblock counters. Note that if you add new counters,
- * you may need to define new counter disabled bit field descriptors as there
- * are more possible fields in the superblock that can fit in a bitfield on a
- * 32 bit platform. The XFS_SBS_* values for the current current counters just
- * fit.
- */
-typedef struct xfs_icsb_cnts {
-	uint64_t	icsb_fdblocks;
-	uint64_t	icsb_ifree;
-	unsigned long	icsb_flags;
-} xfs_icsb_cnts_t;
-
-#define XFS_ICSB_FLAG_LOCK	(1 << 0)	/* counter lock bit */
-
-#define XFS_ICSB_LAZY_COUNT	(1 << 1)	/* accuracy not needed */
-
-extern int	xfs_icsb_init_counters(struct xfs_mount *);
-extern void	xfs_icsb_reinit_counters(struct xfs_mount *);
-extern void	xfs_icsb_destroy_counters(struct xfs_mount *);
-extern void	xfs_icsb_sync_counters(struct xfs_mount *, int);
-extern void	xfs_icsb_sync_counters_locked(struct xfs_mount *, int);
-extern int	xfs_icsb_modify_counters(struct xfs_mount *, xfs_sb_field_t,
-						int64_t, int);
-
-#else
-#define xfs_icsb_init_counters(mp)		(0)
-#define xfs_icsb_destroy_counters(mp)		do { } while (0)
-#define xfs_icsb_reinit_counters(mp)		do { } while (0)
-#define xfs_icsb_sync_counters(mp, flags)	do { } while (0)
-#define xfs_icsb_sync_counters_locked(mp, flags) do { } while (0)
-#define xfs_icsb_modify_counters(mp, field, delta, rsvd) \
-	xfs_mod_incore_sb(mp, field, delta, rsvd)
-#endif
+int	xfs_sb_init_percpu_counters(struct xfs_mount *);
+void	xfs_sb_destroy_percpu_counters(struct xfs_mount *);
 
 /* dynamic preallocation free space thresholds, 5% down to 1% */
 enum {
@@ -151,12 +117,6 @@ typedef struct xfs_mount {
 	const struct xfs_dir_ops *m_nondir_inode_ops; /* !dir inode ops */
 	uint			m_chsize;	/* size of next field */
 	atomic_t		m_active_trans;	/* number trans frozen */
-#ifdef HAVE_PERCPU_SB
-	xfs_icsb_cnts_t __percpu *m_sb_cnts;	/* per-cpu superblock counters */
-	unsigned long		m_icsb_counters; /* disabled per-cpu counters */
-	struct notifier_block	m_icsb_notifier; /* hotplug cpu notifier */
-	struct mutex		m_icsb_mutex;	/* balancer sync lock */
-#endif
 	struct xfs_mru_cache	*m_filestream;  /* per-mount filestream data */
 	struct delayed_work	m_reclaim_work;	/* background inode reclaim */
 	struct delayed_work	m_eofblocks_work; /* background eof blocks
@@ -289,26 +249,6 @@ xfs_daddr_to_agbno(struct xfs_mount *mp, xfs_daddr_t d)
 }
 
 /*
- * Per-cpu superblock locking functions
- */
-#ifdef HAVE_PERCPU_SB
-static inline void
-xfs_icsb_lock(xfs_mount_t *mp)
-{
-	mutex_lock(&mp->m_icsb_mutex);
-}
-
-static inline void
-xfs_icsb_unlock(xfs_mount_t *mp)
-{
-	mutex_unlock(&mp->m_icsb_mutex);
-}
-#else
-#define xfs_icsb_lock(mp)
-#define xfs_icsb_unlock(mp)
-#endif
-
-/*
  * This structure is for use by the xfs_mod_incore_sb_batch() routine.
  * xfs_growfs can specify a few fields which are more than int limit
  */
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 0fa688a..7a0bc92 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1035,23 +1035,6 @@ xfs_free_fsname(
 	kfree(mp->m_logname);
 }
 
-STATIC void
-xfs_fs_put_super(
-	struct super_block	*sb)
-{
-	struct xfs_mount	*mp = XFS_M(sb);
-
-	xfs_filestream_unmount(mp);
-	xfs_unmountfs(mp);
-
-	xfs_freesb(mp);
-	xfs_icsb_destroy_counters(mp);
-	xfs_destroy_mount_workqueues(mp);
-	xfs_close_devices(mp);
-	xfs_free_fsname(mp);
-	kfree(mp);
-}
-
 STATIC int
 xfs_fs_sync_fs(
 	struct super_block	*sb,
@@ -1100,8 +1083,6 @@ xfs_fs_statfs(
 	statp->f_fsid.val[0] = (u32)id;
 	statp->f_fsid.val[1] = (u32)(id >> 32);
 
-	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
-
 	sb_icount = percpu_counter_sum(&sbp->sb_icount);
 	sb_ifree = percpu_counter_sum(&sbp->sb_ifree);
 	sb_fdblocks = percpu_counter_sum(&sbp->sb_fdblocks);
@@ -1407,6 +1388,42 @@ xfs_finish_flags(
 	return 0;
 }
 
+static int
+xfs_init_percpu_counters(
+	struct xfs_mount	*mp)
+{
+	int		error;
+
+	error = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
+	if (error)
+		return ENOMEM;
+
+	error = percpu_counter_init(&mp->m_sb.sb_ifree, 0, GFP_KERNEL);
+	if (error)
+		goto free_icount;
+
+	error = percpu_counter_init(&mp->m_sb.sb_fdblocks, 0, GFP_KERNEL);
+	if (error)
+		goto free_ifree;
+
+	return 0;
+
+free_ifree:
+	percpu_counter_destroy(&mp->m_sb.sb_ifree);
+free_icount:
+	percpu_counter_destroy(&mp->m_sb.sb_icount);
+	return -ENOMEM;
+}
+
+static void
+xfs_destroy_percpu_counters(
+	struct xfs_mount	*mp)
+{
+	percpu_counter_destroy(&mp->m_sb.sb_icount);
+	percpu_counter_destroy(&mp->m_sb.sb_ifree);
+	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
+}
+
 STATIC int
 xfs_fs_fill_super(
 	struct super_block	*sb,
@@ -1455,7 +1472,7 @@ xfs_fs_fill_super(
 	if (error)
 		goto out_close_devices;
 
-	error = xfs_icsb_init_counters(mp);
+	error = xfs_init_percpu_counters(mp);
 	if (error)
 		goto out_destroy_workqueues;
 
@@ -1513,7 +1530,7 @@ xfs_fs_fill_super(
  out_free_sb:
 	xfs_freesb(mp);
  out_destroy_counters:
-	xfs_icsb_destroy_counters(mp);
+	xfs_destroy_percpu_counters(mp);
 out_destroy_workqueues:
 	xfs_destroy_mount_workqueues(mp);
  out_close_devices:
@@ -1530,6 +1547,23 @@ out_destroy_workqueues:
 	goto out_free_sb;
 }
 
+STATIC void
+xfs_fs_put_super(
+	struct super_block	*sb)
+{
+	struct xfs_mount	*mp = XFS_M(sb);
+
+	xfs_filestream_unmount(mp);
+	xfs_unmountfs(mp);
+
+	xfs_freesb(mp);
+	xfs_destroy_percpu_counters(mp);
+	xfs_destroy_mount_workqueues(mp);
+	xfs_close_devices(mp);
+	xfs_free_fsname(mp);
+	kfree(mp);
+}
+
 STATIC struct dentry *
 xfs_fs_mount(
 	struct file_system_type	*fs_type,
-- 
2.0.0


* Re: [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
  2015-02-01 21:42 ` [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format Dave Chinner
@ 2015-02-02  8:41   ` Christoph Hellwig
  2015-02-02 19:30     ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-02  8:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

>  /*
> - * Superblock - in core version.  Must match the ondisk version below.
> - * Must be padded to 64 bit alignment.
> - */
> -typedef struct xfs_sb {
> -	__uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
> -	__uint32_t	sb_blocksize;	/* logical block size, bytes */

> -static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
> +static inline int xfs_sb_version_hasfinobt(struct xfs_sb *sbp)

So xfs_format.h now requires struct xfs_sb to be defined before it
can be included?  I guess we need to move these macros around as well.

> diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
> index 2b830c2..a02236b 100644
> --- a/fs/xfs/xfs_super.h
> +++ b/fs/xfs/xfs_super.h
> @@ -61,6 +61,87 @@ struct xfs_mount;
>  struct xfs_buftarg;
>  struct block_device;
>  
> +/*
> + * Superblock - in core version.  This does not have to match the size and shape
> + * of the on-disk superblock, but must contain all the fields that we use in the
> + * on-disk superblock.
> + */
> +struct xfs_sb {

Is this really the right header?  xfs_super.h really is only for bits
related to linux super block operations.

I'd expect to move it close to struct xfs_mount, and maybe even merge
it into that in the long run.
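
(Illustrative only - roughly the shape being suggested here, not code
from the series; the field layout is hypothetical:)

/* fs/xfs/xfs_mount.h: keep the in-core superblock definition next to
 * (or eventually folded into) the mount structure that embeds it */
struct xfs_sb {
	/* ... geometry and feature fields ... */
	struct percpu_counter	sb_icount;	/* allocated inodes */
	/* ... */
};

typedef struct xfs_mount {
	struct xfs_sb		m_sb;		/* in-core superblock */
	/* ... */
} xfs_mount_t;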


* Re: [PATCH 2/5] xfs: use generic percpu counters for inode counter
  2015-02-01 21:43 ` [PATCH 2/5] xfs: use generic percpu counters for inode counter Dave Chinner
@ 2015-02-02 16:44   ` Christoph Hellwig
  2015-02-02 19:33     ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-02 16:44 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 4cf335b..7bfa527 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -357,7 +357,8 @@ __xfs_sb_from_disk(
>  	to->sb_rextslog = from->sb_rextslog;
>  	to->sb_inprogress = from->sb_inprogress;
>  	to->sb_imax_pct = from->sb_imax_pct;
> -	to->sb_icount = be64_to_cpu(from->sb_icount);
> +	if (percpu_counter_initialized(&to->sb_icount))
> +		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));

Why would the percpu counter not be initialized here?  Oh, I guess
this is for xfs_sb_verify().  But why can't xfs_mount_validate_sb simply
operate on the disk endian SB to avoid that whole issue?
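
(For illustration only - a rough sketch of that suggestion with made-up
names; a real version would check many more fields:)

/* Hypothetical: validate straight off the disk-endian superblock so
 * the in-core struct xfs_sb and its percpu counters never get involved
 * in verification. */
static int
xfs_validate_dsb(
	struct xfs_dsb		*dsb)
{
	if (dsb->sb_magicnum != cpu_to_be32(XFS_SB_MAGIC))
		return -EWRONGFS;
	if (be64_to_cpu(dsb->sb_ifree) > be64_to_cpu(dsb->sb_icount))
		return -EFSCORRUPTED;
	/* ... remaining field checks elided ... */
	return 0;
}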

> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 6015f54..df5ec55 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1127,13 +1127,13 @@ xfs_mod_incore_sb_unlocked(
>  	 */
>  	switch (field) {
>  	case XFS_SBS_ICOUNT:
> +		/* deltas are +/-64, hence the large batch size of 128. */
> +		__percpu_counter_add(&mp->m_sb.sb_icount, delta, 128);
> +		if (percpu_counter_compare(&mp->m_sb.sb_icount, 0) < 0) {
>  			ASSERT(0);
> +			percpu_counter_add(&mp->m_sb.sb_icount, -delta);
>  			return -EINVAL;
>  		}
>  		return 0;
>  	case XFS_SBS_IFREE:
>  		lcounter = (long long)mp->m_sb.sb_ifree;
> @@ -1288,8 +1288,11 @@ xfs_mod_incore_sb(
>  	int			status;
>  
>  #ifdef HAVE_PERCPU_SB
> -	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
> +	ASSERT(field < XFS_SBS_IFREE || field > XFS_SBS_FDBLOCKS);
>  #endif
> +	if (field == XFS_SBS_ICOUNT)
> +		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
> +

Why is this multiplexed through xfs_mod_incore_sb_unlocked while needing
a different locking context?  Shouldn't we simply use a different helper
for this case?
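
(A sketch of the dedicated helper being suggested - the name and
placement are hypothetical, not from the posted series:)

STATIC int
xfs_mod_icount(
	struct xfs_mount	*mp,
	int64_t			delta)
{
	/* deltas are +/-64, hence the large batch size of 128 */
	__percpu_counter_add(&mp->m_sb.sb_icount, delta, 128);
	if (percpu_counter_compare(&mp->m_sb.sb_icount, 0) < 0) {
		/* overshot zero - undo the change and report it */
		ASSERT(0);
		percpu_counter_add(&mp->m_sb.sb_icount, -delta);
		return -EINVAL;
	}
	return 0;
}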

>  	xfs_icsb_cnts_t *cntp;
>  	int		i;
>  
> +	i = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
> +	if (i)
> +		return ENOMEM;
> +
>  	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
> -	if (mp->m_sb_cnts == NULL)
> +	if (!mp->m_sb_cnts) {
> +		percpu_counter_destroy(&mp->m_sb.sb_icount);
>  		return -ENOMEM;
> +	}
>  
>  	for_each_online_cpu(i) {

Reusing a variable for both an errno value and a loop iterator is
not very readable; just add an additional "error" variable.

Also, percpu_counter_init returns a proper negative errno value; no need
to turn that into the incorrect positive ENOMEM.

Additionally, should this use goto unwinding?
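
(Purely illustrative - one way to address all three points above at
once: a separate error variable, the negative errno propagated as-is,
and goto unwinding for the error paths:)

int
xfs_icsb_init_counters(
	struct xfs_mount	*mp)
{
	int			error;
	int			i;

	error = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
	if (error)
		return error;

	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
	if (!mp->m_sb_cnts) {
		error = -ENOMEM;
		goto free_icount;
	}

	for_each_online_cpu(i)
		memset(per_cpu_ptr(mp->m_sb_cnts, i), 0,
		       sizeof(xfs_icsb_cnts_t));

	return 0;

free_icount:
	percpu_counter_destroy(&mp->m_sb.sb_icount);
	return error;
}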

>  	if (idelta) {
> -		error = xfs_icsb_modify_counters(mp, XFS_SBS_ICOUNT,
> -						 idelta, rsvd);
> +		error = xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, idelta, rsvd);

Why go through xfs_mod_incore_sb here instead of directly jumping to
the function that does the work?
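
(Sketch only - assuming a dedicated helper such as the hypothetical
xfs_mod_icount() above existed, the call site could invoke it directly:)

	if (idelta) {
		error = xfs_mod_icount(mp, idelta);
		if (error)
			goto out_undo_fdblocks;	/* unwind earlier deltas */
	}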


* Re: [PATCH 4/5] xfs: use generic percpu counters for free block counter
  2015-02-01 21:43 ` [PATCH 4/5] xfs: use generic percpu counters for free block counter Dave Chinner
@ 2015-02-02 16:48   ` Christoph Hellwig
  2015-02-02 19:34     ` Dave Chinner
  2015-02-02 17:11   ` Brian Foster
  1 sibling, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-02 16:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

>  	case XFS_SBS_FDBLOCKS:
>  
>  		if (delta > 0) {		/* Putting blocks back */
> +			if (mp->m_resblks == mp->m_resblks_avail) {
> +				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
> +				return 0;
> +			}
> +
> +			/* put blocks back into reserve pool first */
> +			spin_lock(&mp->m_sb_lock);
> +			res_used = (long long)
> +					(mp->m_resblks - mp->m_resblks_avail);
> +
>  			if (res_used > delta) {
>  				mp->m_resblks_avail += delta;
>  			} else {
> +				delta -= res_used;
>  				mp->m_resblks_avail = mp->m_resblks;
> +				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
>  			}
> +			spin_unlock(&mp->m_sb_lock);
> +			return 0;
>  
> +		}

>  
> +		/*
> +		 * Taking blocks away, need to be more accurate the closer we
> +		 * are to zero.
> +		 *
> +		 * batch size is set to a maximum of 1024 blocks - if we are
> +		 * allocating or freeing extents larger than this then we aren't
> +		 * going to be hammering the counter lock so a lock per update
> +		 * is not a problem.
> +		 *
> +		 * If the counter has a value of less than 2 * max batch size,
> +		 * then make everything serialise as we are real close to
> +		 * ENOSPC.
> +		 */
> +#define __BATCH	1024
> +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> +					   2 * __BATCH) < 0)
> +			batch = 1;
> +		else
> +			batch = __BATCH;
> +
> +		__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
> +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> +					   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
> +			/* we had space! */
> +			return 0;
>  		}
>  
> +		/*
> +		 * lock up the sb for dipping into reserves before releasing
> +		 * the space that took us to ENOSPC.
> +		 */
> +		spin_lock(&mp->m_sb_lock);
> +		percpu_counter_add(&mp->m_sb.sb_fdblocks, -delta);
> +		if (!rsvd)
> +			goto fdblocks_enospc;
> +
> +		lcounter = (long long)mp->m_resblks_avail + delta;
> +		if (lcounter >= 0) {
> +			mp->m_resblks_avail = lcounter;
> +			spin_unlock(&mp->m_sb_lock);
> +			return 0;
> +		}
> +		printk_once(KERN_WARNING
> +			"Filesystem \"%s\": reserve blocks depleted! "
> +			"Consider increasing reserve pool size.",
> +			mp->m_fsname);
> +fdblocks_enospc:
> +		spin_unlock(&mp->m_sb_lock);
> +		return -ENOSPC;
> +

This screams for having two different helpers for removing vs adding back
reserved blocks.
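
(A sketch of that split with hypothetical names - one helper would take
free blocks, dipping into the reserve pool on ENOSPC when allowed, and
its counterpart below would give blocks back, refilling the reserve
pool first, mirroring the quoted hunk:)

static void
xfs_fdblocks_put(
	struct xfs_mount	*mp,
	uint64_t		delta)
{
	uint64_t		res_used;

	/* fast path - the reserve pool is already full */
	if (mp->m_resblks == mp->m_resblks_avail) {
		percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
		return;
	}

	/* top up the reserve pool before freeing the remainder */
	spin_lock(&mp->m_sb_lock);
	res_used = mp->m_resblks - mp->m_resblks_avail;
	if (res_used >= delta) {
		mp->m_resblks_avail += delta;
	} else {
		mp->m_resblks_avail = mp->m_resblks;
		percpu_counter_add(&mp->m_sb.sb_fdblocks, delta - res_used);
	}
	spin_unlock(&mp->m_sb_lock);
}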


* Re: [PATCH 3/5] xfs: use generic percpu counters for free inode counter
  2015-02-01 21:43 ` [PATCH 3/5] xfs: use generic percpu counters for free " Dave Chinner
@ 2015-02-02 17:10   ` Brian Foster
  0 siblings, 0 replies; 24+ messages in thread
From: Brian Foster @ 2015-02-02 17:10 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Mon, Feb 02, 2015 at 08:43:01AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> XFS has hand-rolled per-cpu counters for the superblock since before
> there was any generic implementation. The free inode counter is not
> used for any limit enforcement - the per-AG free inode counters are
> used during allocation to determine if there are inode available for
> allocation.
> 
> Hence we don't need any of the complexity of the hand-rolled
> counters and we can simply replace them with generic per-cpu
> counters similar to the inode counter.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_sb.c |  8 ++++---
>  fs/xfs/xfs_fsops.c     |  2 +-
>  fs/xfs/xfs_mount.c     | 62 +++++++++++++++++---------------------------------
>  fs/xfs/xfs_super.c     |  4 +++-
>  fs/xfs/xfs_super.h     |  2 +-
>  fs/xfs/xfs_trans.c     |  5 ++--
>  6 files changed, 33 insertions(+), 50 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 7bfa527..42e5c89 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -359,7 +359,8 @@ __xfs_sb_from_disk(
>  	to->sb_imax_pct = from->sb_imax_pct;
>  	if (percpu_counter_initialized(&to->sb_icount))
>  		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
> -	to->sb_ifree = be64_to_cpu(from->sb_ifree);
> +	if (percpu_counter_initialized(&to->sb_icount))

					sb_ifree

Brian

> +		percpu_counter_set(&to->sb_ifree, be64_to_cpu(from->sb_ifree));
>  	to->sb_fdblocks = be64_to_cpu(from->sb_fdblocks);
>  	to->sb_frextents = be64_to_cpu(from->sb_frextents);
>  	to->sb_uquotino = be64_to_cpu(from->sb_uquotino);
> @@ -494,7 +495,7 @@ xfs_sb_to_disk(
>  	to->sb_inprogress = from->sb_inprogress;
>  	to->sb_imax_pct = from->sb_imax_pct;
>  	to->sb_icount = cpu_to_be64(percpu_counter_sum(&from->sb_icount));
> -	to->sb_ifree = cpu_to_be64(from->sb_ifree);
> +	to->sb_ifree = cpu_to_be64(percpu_counter_sum(&from->sb_ifree));
>  	to->sb_fdblocks = cpu_to_be64(from->sb_fdblocks);
>  	to->sb_frextents = cpu_to_be64(from->sb_frextents);
>  
> @@ -540,6 +541,7 @@ xfs_sb_verify(
>  
>  	/* don't need to validate icount here */
>  	sb.sb_icount.counters = NULL;
> +	sb.sb_ifree.counters = NULL;
>  
>  	/*
>  	 * Use call variant which doesn't convert quota flags from disk 
> @@ -751,7 +753,7 @@ xfs_initialize_perag_data(
>  	 * Overwrite incore superblock counters with just-read data
>  	 */
>  	spin_lock(&mp->m_sb_lock);
> -	sbp->sb_ifree = ifree;
> +	percpu_counter_set(&sbp->sb_ifree, ifree);
>  	percpu_counter_set(&sbp->sb_icount, ialloc);
>  	sbp->sb_fdblocks = bfree + bfreelst + btree;
>  	spin_unlock(&mp->m_sb_lock);
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 9cc34d2..619a9f3 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -632,11 +632,11 @@ xfs_fs_counts(
>  {
>  	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
>  	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
> +	cnt->freeino = percpu_counter_read_positive(&mp->m_sb.sb_ifree);
>  
>  	spin_lock(&mp->m_sb_lock);
>  	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
>  	cnt->freertx = mp->m_sb.sb_frextents;
> -	cnt->freeino = mp->m_sb.sb_ifree;
>  	spin_unlock(&mp->m_sb_lock);
>  	return 0;
>  }
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index df5ec55..8e8924f 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1136,13 +1136,12 @@ xfs_mod_incore_sb_unlocked(
>  		}
>  		return 0;
>  	case XFS_SBS_IFREE:
> -		lcounter = (long long)mp->m_sb.sb_ifree;
> -		lcounter += delta;
> -		if (lcounter < 0) {
> +		percpu_counter_add(&mp->m_sb.sb_ifree, delta);
> +		if (percpu_counter_compare(&mp->m_sb.sb_ifree, 0) < 0) {
>  			ASSERT(0);
> +			percpu_counter_add(&mp->m_sb.sb_ifree, -delta);
>  			return -EINVAL;
>  		}
> -		mp->m_sb.sb_ifree = lcounter;
>  		return 0;
>  	case XFS_SBS_FDBLOCKS:
>  		lcounter = (long long)
> @@ -1288,9 +1287,9 @@ xfs_mod_incore_sb(
>  	int			status;
>  
>  #ifdef HAVE_PERCPU_SB
> -	ASSERT(field < XFS_SBS_IFREE || field > XFS_SBS_FDBLOCKS);
> +	ASSERT(field != XFS_SBS_FDBLOCKS);
>  #endif
> -	if (field == XFS_SBS_ICOUNT)
> +	if (field == XFS_SBS_ICOUNT || field == XFS_SBS_IFREE)
>  		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
>  
>  	spin_lock(&mp->m_sb_lock);
> @@ -1495,7 +1494,6 @@ xfs_icsb_cpu_notify(
>  	case CPU_ONLINE:
>  	case CPU_ONLINE_FROZEN:
>  		xfs_icsb_lock(mp);
> -		xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
>  		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
>  		xfs_icsb_unlock(mp);
>  		break;
> @@ -1506,15 +1504,12 @@ xfs_icsb_cpu_notify(
>  		 * re-enable the counters. */
>  		xfs_icsb_lock(mp);
>  		spin_lock(&mp->m_sb_lock);
> -		xfs_icsb_disable_counter(mp, XFS_SBS_IFREE);
>  		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
>  
> -		mp->m_sb.sb_ifree += cntp->icsb_ifree;
>  		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
>  
>  		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
>  
> -		xfs_icsb_balance_counter_locked(mp, XFS_SBS_IFREE, 0);
>  		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
>  		spin_unlock(&mp->m_sb_lock);
>  		xfs_icsb_unlock(mp);
> @@ -1536,11 +1531,13 @@ xfs_icsb_init_counters(
>  	if (i)
>  		return ENOMEM;
>  
> +	i = percpu_counter_init(&mp->m_sb.sb_ifree, 0, GFP_KERNEL);
> +	if (i)
> +		goto free_icount;
> +
>  	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
> -	if (!mp->m_sb_cnts) {
> -		percpu_counter_destroy(&mp->m_sb.sb_icount);
> -		return -ENOMEM;
> -	}
> +	if (!mp->m_sb_cnts)
> +		goto free_ifree;
>  
>  	for_each_online_cpu(i) {
>  		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
> @@ -1562,6 +1559,12 @@ xfs_icsb_init_counters(
>  #endif /* CONFIG_HOTPLUG_CPU */
>  
>  	return 0;
> +
> +free_ifree:
> +	percpu_counter_destroy(&mp->m_sb.sb_ifree);
> +free_icount:
> +	percpu_counter_destroy(&mp->m_sb.sb_icount);
> +	return -ENOMEM;
>  }
>  
>  void
> @@ -1574,7 +1577,6 @@ xfs_icsb_reinit_counters(
>  	 * initial balance kicks us off correctly
>  	 */
>  	mp->m_icsb_counters = -1;
> -	xfs_icsb_balance_counter(mp, XFS_SBS_IFREE, 0);
>  	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
>  	xfs_icsb_unlock(mp);
>  }
> @@ -1589,6 +1591,7 @@ xfs_icsb_destroy_counters(
>  	}
>  
>  	percpu_counter_destroy(&mp->m_sb.sb_icount);
> +	percpu_counter_destroy(&mp->m_sb.sb_ifree);
>  
>  	mutex_destroy(&mp->m_icsb_mutex);
>  }
> @@ -1652,7 +1655,6 @@ xfs_icsb_count(
>  
>  	for_each_online_cpu(i) {
>  		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
> -		cnt->icsb_ifree += cntp->icsb_ifree;
>  		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
>  	}
>  
> @@ -1665,7 +1667,7 @@ xfs_icsb_counter_disabled(
>  	xfs_mount_t	*mp,
>  	xfs_sb_field_t	field)
>  {
> -	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
> +	ASSERT(field == XFS_SBS_FDBLOCKS);
>  	return test_bit(field, &mp->m_icsb_counters);
>  }
>  
> @@ -1676,7 +1678,7 @@ xfs_icsb_disable_counter(
>  {
>  	xfs_icsb_cnts_t	cnt;
>  
> -	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
> +	ASSERT(field == XFS_SBS_FDBLOCKS);
>  
>  	/*
>  	 * If we are already disabled, then there is nothing to do
> @@ -1695,9 +1697,6 @@ xfs_icsb_disable_counter(
>  
>  		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
>  		switch(field) {
> -		case XFS_SBS_IFREE:
> -			mp->m_sb.sb_ifree = cnt.icsb_ifree;
> -			break;
>  		case XFS_SBS_FDBLOCKS:
>  			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
>  			break;
> @@ -1719,15 +1718,12 @@ xfs_icsb_enable_counter(
>  	xfs_icsb_cnts_t	*cntp;
>  	int		i;
>  
> -	ASSERT((field >= XFS_SBS_IFREE) && (field <= XFS_SBS_FDBLOCKS));
> +	ASSERT(field == XFS_SBS_FDBLOCKS);
>  
>  	xfs_icsb_lock_all_counters(mp);
>  	for_each_online_cpu(i) {
>  		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
>  		switch (field) {
> -		case XFS_SBS_IFREE:
> -			cntp->icsb_ifree = count + resid;
> -			break;
>  		case XFS_SBS_FDBLOCKS:
>  			cntp->icsb_fdblocks = count + resid;
>  			break;
> @@ -1750,8 +1746,6 @@ xfs_icsb_sync_counters_locked(
>  
>  	xfs_icsb_count(mp, &cnt, flags);
>  
> -	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_IFREE))
> -		mp->m_sb.sb_ifree = cnt.icsb_ifree;
>  	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
>  		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
>  }
> @@ -1803,12 +1797,6 @@ xfs_icsb_balance_counter_locked(
>  
>  	/* update counters  - first CPU gets residual*/
>  	switch (field) {
> -	case XFS_SBS_IFREE:
> -		count = mp->m_sb.sb_ifree;
> -		resid = do_div(count, weight);
> -		if (count < max(min, XFS_ICSB_INO_CNTR_REENABLE))
> -			return;
> -		break;
>  	case XFS_SBS_FDBLOCKS:
>  		count = mp->m_sb.sb_fdblocks;
>  		resid = do_div(count, weight);
> @@ -1863,14 +1851,6 @@ again:
>  	}
>  
>  	switch (field) {
> -	case XFS_SBS_IFREE:
> -		lcounter = icsbp->icsb_ifree;
> -		lcounter += delta;
> -		if (unlikely(lcounter < 0))
> -			goto balance_counter;
> -		icsbp->icsb_ifree = lcounter;
> -		break;
> -
>  	case XFS_SBS_FDBLOCKS:
>  		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
>  
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 408e2fe..c17bfa4 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1088,6 +1088,7 @@ xfs_fs_statfs(
>  	struct xfs_inode	*ip = XFS_I(dentry->d_inode);
>  	__uint64_t		fakeinos, id;
>  	__uint64_t		sb_icount;
> +	__uint64_t		sb_ifree;
>  	xfs_extlen_t		lsize;
>  	__int64_t		ffree;
>  
> @@ -1100,6 +1101,7 @@ xfs_fs_statfs(
>  
>  	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
>  	sb_icount = percpu_counter_sum(&sbp->sb_icount);
> +	sb_ifree = percpu_counter_sum(&sbp->sb_ifree);
>  
>  	spin_lock(&mp->m_sb_lock);
>  	statp->f_bsize = sbp->sb_blocksize;
> @@ -1116,7 +1118,7 @@ xfs_fs_statfs(
>  					mp->m_maxicount);
>  
>  	/* make sure statp->f_ffree does not underflow */
> -	ffree = statp->f_files - (sb_icount - sbp->sb_ifree);
> +	ffree = statp->f_files - (sb_icount - sb_ifree);
>  	statp->f_ffree = max_t(__int64_t, ffree, 0);
>  
>  	spin_unlock(&mp->m_sb_lock);
> diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
> index fa5603c..6efc7a2 100644
> --- a/fs/xfs/xfs_super.h
> +++ b/fs/xfs/xfs_super.h
> @@ -98,8 +98,8 @@ struct xfs_sb {
>  					/* statistics */
>  
>  	struct percpu_counter	sb_icount;	/* allocated inodes */
> +	struct percpu_counter	sb_ifree;	/* free inodes */
>  
> -	__uint64_t	sb_ifree;	/* free inodes */
>  	__uint64_t	sb_fdblocks;	/* free data blocks */
>  	__uint64_t	sb_frextents;	/* free realtime extents */
>  	xfs_ino_t	sb_uquotino;	/* user quota inode */
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index d78b0ae..c54d4b7 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -560,8 +560,7 @@ xfs_trans_unreserve_and_mod_sb(
>  	}
>  
>  	if (ifreedelta) {
> -		error = xfs_icsb_modify_counters(mp, XFS_SBS_IFREE,
> -						 ifreedelta, rsvd);
> +		error = xfs_mod_incore_sb(mp, XFS_SBS_IFREE, ifreedelta, rsvd);
>  		if (error)
>  			goto out_undo_icount;
>  	}
> @@ -630,7 +629,7 @@ xfs_trans_unreserve_and_mod_sb(
>  
>  out_undo_ifreecount:
>  	if (ifreedelta)
> -		xfs_icsb_modify_counters(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
> +		xfs_mod_incore_sb(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
>  out_undo_icount:
>  	if (idelta)
>  		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
> -- 
> 2.0.0
> 

* Re: [PATCH 5/5] xfs: Remove icsb infrastructure
  2015-02-01 21:43 ` [PATCH 5/5] xfs: Remove icsb infrastructure Dave Chinner
@ 2015-02-02 17:11   ` Brian Foster
  0 siblings, 0 replies; 24+ messages in thread
From: Brian Foster @ 2015-02-02 17:11 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Mon, Feb 02, 2015 at 08:43:03AM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Now that the in-core superblock infrastructure has been replaced with
> generic per-cpu counters, we don't need it anymore. Nuke it from
> orbit so we are sure that it won't haunt us again...
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c |  16 +-
>  fs/xfs/libxfs/xfs_sb.c   |  10 +-
>  fs/xfs/xfs_fsops.c       |   2 -
>  fs/xfs/xfs_iomap.c       |   1 -
>  fs/xfs/xfs_linux.h       |   9 -
>  fs/xfs/xfs_log_recover.c |   3 -
>  fs/xfs/xfs_mount.c       | 509 -----------------------------------------------
>  fs/xfs/xfs_mount.h       |  64 +-----
>  fs/xfs/xfs_super.c       |  76 +++++--
>  9 files changed, 67 insertions(+), 623 deletions(-)
> 
...
> +static int
> +xfs_init_percpu_counters(
> +	struct xfs_mount	*mp)
> +{
> +	int		error;
> +
> +	error = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
> +	if (error)
> +		return ENOMEM;

			-ENOMEM

Brian

> +
> +	error = percpu_counter_init(&mp->m_sb.sb_ifree, 0, GFP_KERNEL);
> +	if (error)
> +		goto free_icount;
> +
> +	error = percpu_counter_init(&mp->m_sb.sb_fdblocks, 0, GFP_KERNEL);
> +	if (error)
> +		goto free_ifree;
> +
> +	return 0;
> +
> +free_ifree:
> +	percpu_counter_destroy(&mp->m_sb.sb_ifree);
> +free_icount:
> +	percpu_counter_destroy(&mp->m_sb.sb_icount);
> +	return -ENOMEM;
> +}
> +
> +static void
> +xfs_destroy_percpu_counters(
> +	struct xfs_mount	*mp)
> +{
> +	percpu_counter_destroy(&mp->m_sb.sb_icount);
> +	percpu_counter_destroy(&mp->m_sb.sb_ifree);
> +	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
> +}
> +
>  STATIC int
>  xfs_fs_fill_super(
>  	struct super_block	*sb,
> @@ -1455,7 +1472,7 @@ xfs_fs_fill_super(
>  	if (error)
>  		goto out_close_devices;
>  
> -	error = xfs_icsb_init_counters(mp);
> +	error = xfs_init_percpu_counters(mp);
>  	if (error)
>  		goto out_destroy_workqueues;
>  
> @@ -1513,7 +1530,7 @@ xfs_fs_fill_super(
>   out_free_sb:
>  	xfs_freesb(mp);
>   out_destroy_counters:
> -	xfs_icsb_destroy_counters(mp);
> +	xfs_destroy_percpu_counters(mp);
>  out_destroy_workqueues:
>  	xfs_destroy_mount_workqueues(mp);
>   out_close_devices:
> @@ -1530,6 +1547,23 @@ out_destroy_workqueues:
>  	goto out_free_sb;
>  }
>  
> +STATIC void
> +xfs_fs_put_super(
> +	struct super_block	*sb)
> +{
> +	struct xfs_mount	*mp = XFS_M(sb);
> +
> +	xfs_filestream_unmount(mp);
> +	xfs_unmountfs(mp);
> +
> +	xfs_freesb(mp);
> +	xfs_destroy_percpu_counters(mp);
> +	xfs_destroy_mount_workqueues(mp);
> +	xfs_close_devices(mp);
> +	xfs_free_fsname(mp);
> +	kfree(mp);
> +}
> +
>  STATIC struct dentry *
>  xfs_fs_mount(
>  	struct file_system_type	*fs_type,
> -- 
> 2.0.0
> 

* Re: [PATCH 4/5] xfs: use generic percpu counters for free block counter
  2015-02-01 21:43 ` [PATCH 4/5] xfs: use generic percpu counters for free block counter Dave Chinner
  2015-02-02 16:48   ` Christoph Hellwig
@ 2015-02-02 17:11   ` Brian Foster
  2015-02-02 19:39     ` Dave Chinner
  1 sibling, 1 reply; 24+ messages in thread
From: Brian Foster @ 2015-02-02 17:11 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Mon, Feb 02, 2015 at 08:43:02AM +1100, Dave Chinner wrote:
> XFS has hand-rolled per-cpu counters for the superblock since before
> there was any generic implementation. The free block counter is
> special in that it is used for ENOSPC detection outside transaction
> contexts for delayed allocation. This means that the counter
> needs to be accurate at zero. The current per-cpu counter code jumps
> through lots of hoops to ensure we never run past zero, but we don't
> need to make all those jumps with the generic counter
> implementation.
> 
> The generic counter implementation allows us to pass a "batch"
> threshold at which the addition/subtraction to the counter value
> will be folded back into the global value under lock. We can use this
> feature to reduce the batch size as we approach 0 in a very similar
> manner to the existing counters and their rebalance algorithm. If we
> use a batch size of 1 as we approach 0, then every addition and
> subtraction will be done against the global value and hence allow
> accurate detection of zero threshold crossing.
> 
> Hence we can replace the hand-rolled, accurate-at-zero counters with
> generic percpu counters.
> 
> Note: this removes just enough of the icsb infrastructure to compile
> without warnings. The rest will go in subsequent commits.
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> ---
>  fs/xfs/libxfs/xfs_sb.c |  11 +++-
>  fs/xfs/xfs_fsops.c     |   9 +--
>  fs/xfs/xfs_iomap.c     |   2 +-
>  fs/xfs/xfs_mount.c     | 167 ++++++++++++++++++++++++-------------------------
>  fs/xfs/xfs_super.c     |  11 +++-
>  fs/xfs/xfs_super.h     |   2 +-
>  fs/xfs/xfs_trans.c     |   9 ++-
>  7 files changed, 109 insertions(+), 102 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> index 42e5c89..bdde5c7 100644
> --- a/fs/xfs/libxfs/xfs_sb.c
> +++ b/fs/xfs/libxfs/xfs_sb.c
> @@ -357,11 +357,15 @@ __xfs_sb_from_disk(
>  	to->sb_rextslog = from->sb_rextslog;
>  	to->sb_inprogress = from->sb_inprogress;
>  	to->sb_imax_pct = from->sb_imax_pct;
> +
>  	if (percpu_counter_initialized(&to->sb_icount))
>  		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
> 	if (percpu_counter_initialized(&to->sb_ifree))
>  		percpu_counter_set(&to->sb_ifree, be64_to_cpu(from->sb_ifree));
> -	to->sb_fdblocks = be64_to_cpu(from->sb_fdblocks);
> +	if (percpu_counter_initialized(&to->sb_fdblocks))
> +		percpu_counter_set(&to->sb_fdblocks,
> +				    be64_to_cpu(from->sb_fdblocks));
> +
>  	to->sb_frextents = be64_to_cpu(from->sb_frextents);
>  	to->sb_uquotino = be64_to_cpu(from->sb_uquotino);
>  	to->sb_gquotino = be64_to_cpu(from->sb_gquotino);
> @@ -496,7 +500,7 @@ xfs_sb_to_disk(
>  	to->sb_imax_pct = from->sb_imax_pct;
>  	to->sb_icount = cpu_to_be64(percpu_counter_sum(&from->sb_icount));
>  	to->sb_ifree = cpu_to_be64(percpu_counter_sum(&from->sb_ifree));
> -	to->sb_fdblocks = cpu_to_be64(from->sb_fdblocks);
> +	to->sb_fdblocks = cpu_to_be64(percpu_counter_sum(&from->sb_fdblocks));
>  	to->sb_frextents = cpu_to_be64(from->sb_frextents);
>  
>  	to->sb_flags = from->sb_flags;
> @@ -542,6 +546,7 @@ xfs_sb_verify(
>  	/* don't need to validate icount here */
>  	sb.sb_icount.counters = NULL;
>  	sb.sb_ifree.counters = NULL;
> +	sb.sb_fdblocks.counters = NULL;
>  
>  	/*
>  	 * Use call variant which doesn't convert quota flags from disk 
> @@ -755,7 +760,7 @@ xfs_initialize_perag_data(
>  	spin_lock(&mp->m_sb_lock);
>  	percpu_counter_set(&sbp->sb_ifree, ifree);
>  	percpu_counter_set(&sbp->sb_icount, ialloc);
> -	sbp->sb_fdblocks = bfree + bfreelst + btree;
> +	percpu_counter_set(&sbp->sb_fdblocks, bfree + bfreelst + btree);
>  	spin_unlock(&mp->m_sb_lock);
>  
>  	/* Fixup the per-cpu counters as well. */
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 619a9f3..ccb00cd 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -633,9 +633,10 @@ xfs_fs_counts(
>  	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
>  	cnt->allocino = percpu_counter_read_positive(&mp->m_sb.sb_icount);
>  	cnt->freeino = percpu_counter_read_positive(&mp->m_sb.sb_ifree);
> +	cnt->freedata = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks) -
> +							XFS_ALLOC_SET_ASIDE(mp);
>  
>  	spin_lock(&mp->m_sb_lock);
> -	cnt->freedata = mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
>  	cnt->freertx = mp->m_sb.sb_frextents;
>  	spin_unlock(&mp->m_sb_lock);
>  	return 0;
> @@ -710,7 +711,8 @@ retry:
>  	} else {
>  		__int64_t	free;
>  
> -		free =  mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
> +		free = percpu_counter_sum(&mp->m_sb.sb_fdblocks) -
> +							XFS_ALLOC_SET_ASIDE(mp);
>  		if (!free)
>  			goto out; /* ENOSPC and fdblks_delta = 0 */
>  
> @@ -749,8 +751,7 @@ out:
>  		 * the extra reserve blocks from the reserve.....
>  		 */
>  		int error;
> -		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
> -						 fdblks_delta, 0);
> +		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, fdblks_delta, 0);
>  		if (error == -ENOSPC)
>  			goto retry;
>  	}
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index ccb1dd0..310433a 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -461,7 +461,7 @@ xfs_iomap_prealloc_size(
>  				       alloc_blocks);
>  
>  	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
> -	freesp = mp->m_sb.sb_fdblocks;
> +	freesp = percpu_counter_read_positive(&mp->m_sb.sb_fdblocks);
>  	if (freesp < mp->m_low_space[XFS_LOWSP_5_PCNT]) {
>  		shift = 2;
>  		if (freesp < mp->m_low_space[XFS_LOWSP_4_PCNT])
> diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
> index 8e8924f..0e37248 100644
> --- a/fs/xfs/xfs_mount.c
> +++ b/fs/xfs/xfs_mount.c
> @@ -1117,7 +1117,8 @@ xfs_mod_incore_sb_unlocked(
>  {
>  	int		scounter;	/* short counter for 32 bit fields */
>  	long long	lcounter;	/* long counter for 64 bit fields */
> -	long long	res_used, rem;
> +	long long	res_used;
> +	s32		batch;
>  
>  	/*
>  	 * With the in-core superblock spin lock held, switch
> @@ -1144,47 +1145,80 @@ xfs_mod_incore_sb_unlocked(
>  		}
>  		return 0;
>  	case XFS_SBS_FDBLOCKS:
> -		lcounter = (long long)
> -			mp->m_sb.sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
> -		res_used = (long long)(mp->m_resblks - mp->m_resblks_avail);
>  
>  		if (delta > 0) {		/* Putting blocks back */
> +			if (mp->m_resblks == mp->m_resblks_avail) {
> +				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
> +				return 0;
> +			}
> +
> +			/* put blocks back into reserve pool first */
> +			spin_lock(&mp->m_sb_lock);
> +			res_used = (long long)
> +					(mp->m_resblks - mp->m_resblks_avail);
> +
>  			if (res_used > delta) {
>  				mp->m_resblks_avail += delta;
>  			} else {
> -				rem = delta - res_used;
> +				delta -= res_used;
>  				mp->m_resblks_avail = mp->m_resblks;
> -				lcounter += rem;
> -			}
> -		} else {				/* Taking blocks away */
> -			lcounter += delta;
> -			if (lcounter >= 0) {
> -				mp->m_sb.sb_fdblocks = lcounter +
> -							XFS_ALLOC_SET_ASIDE(mp);
> -				return 0;
> +				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
>  			}
> +			spin_unlock(&mp->m_sb_lock);
> +			return 0;
>  
> -			/*
> -			 * We are out of blocks, use any available reserved
> -			 * blocks if were allowed to.
> -			 */
> -			if (!rsvd)
> -				return -ENOSPC;
> +		}
>  
> -			lcounter = (long long)mp->m_resblks_avail + delta;
> -			if (lcounter >= 0) {
> -				mp->m_resblks_avail = lcounter;
> -				return 0;
> -			}
> -			printk_once(KERN_WARNING
> -				"Filesystem \"%s\": reserve blocks depleted! "
> -				"Consider increasing reserve pool size.",
> -				mp->m_fsname);
> -			return -ENOSPC;
> +		/*
> +		 * Taking blocks away, need to be more accurate the closer we
> +		 * are to zero.
> +		 *
> +		 * batch size is set to a maximum of 1024 blocks - if we are
> +		 * allocating or freeing extents larger than this then we aren't
> +		 * going to be hammering the counter lock so a lock per update
> +		 * is not a problem.
> +		 *

IIUC, the batch size determines the point at which the local cpu delta
is folded back into the global counter (under lock). If we're allocating
large extents, these will surpass the batch size and result in a locked
update. Smaller updates are aggregated into the cpu counter and folded
in at some later time.
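
To make the pattern concrete, a minimal sketch using the generic
percpu counter API (the helper name and batch constant here are
illustrative only, not the patch itself):

	#include <linux/percpu_counter.h>

	#define FDBLOCKS_BATCH	1024	/* assumed maximum batch size */

	static int mod_fdblocks_sketch(struct percpu_counter *counter, s64 delta)
	{
		s32 batch = FDBLOCKS_BATCH;

		/*
		 * Near zero, each CPU may hold up to (batch - 1) blocks
		 * in its local counter, so switch to a batch of 1 and
		 * fold every modification straight into the global count.
		 */
		if (percpu_counter_compare(counter, 2 * FDBLOCKS_BATCH) < 0)
			batch = 1;

		__percpu_counter_add(counter, delta, batch);
		if (percpu_counter_compare(counter, 0) >= 0)
			return 0;	/* we had space */

		/*
		 * Undo and fail; the real patch takes m_sb_lock here to
		 * dip into the reserve pool before giving up.
		 */
		__percpu_counter_add(counter, -delta, batch);
		return -ENOSPC;
	}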

> +		 * If the counter has a value of less than 2 * max batch size,
> +		 * then make everything serialise as we are real close to
> +		 * ENOSPC.
> +		 */
> +#define __BATCH	1024
> +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> +					   2 * __BATCH) < 0)
> +			batch = 1;
> +		else
> +			batch = __BATCH;
> +

The general approach seems logical. I do wonder whether blocks is the
right scale as opposed to block count normalized against some fixed I/O
size (to account for different block sizes).

Also, it seems like speculative preallocation could hide some of the
overhead here, depending on workload of course. Had you factored that
into your testing?

> +		__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
> +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> +					   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
> +			/* we had space! */
> +			return 0;
>  		}
>  
> -		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
> -		return 0;
> +		/*
> +		 * lock up the sb for dipping into reserves before releasing
> +		 * the space that took us to ENOSPC.
> +		 */
> +		spin_lock(&mp->m_sb_lock);

Can you elaborate on the locking here, why it's needed where it wasn't
before?

Brian

> +		percpu_counter_add(&mp->m_sb.sb_fdblocks, -delta);
> +		if (!rsvd)
> +			goto fdblocks_enospc;
> +
> +		lcounter = (long long)mp->m_resblks_avail + delta;
> +		if (lcounter >= 0) {
> +			mp->m_resblks_avail = lcounter;
> +			spin_unlock(&mp->m_sb_lock);
> +			return 0;
> +		}
> +		printk_once(KERN_WARNING
> +			"Filesystem \"%s\": reserve blocks depleted! "
> +			"Consider increasing reserve pool size.",
> +			mp->m_fsname);
> +fdblocks_enospc:
> +		spin_unlock(&mp->m_sb_lock);
> +		return -ENOSPC;
> +
>  	case XFS_SBS_FREXTENTS:
>  		lcounter = (long long)mp->m_sb.sb_frextents;
>  		lcounter += delta;
> @@ -1286,11 +1320,14 @@ xfs_mod_incore_sb(
>  {
>  	int			status;
>  
> -#ifdef HAVE_PERCPU_SB
> -	ASSERT(field != XFS_SBS_FDBLOCKS);
> -#endif
> -	if (field == XFS_SBS_ICOUNT || field == XFS_SBS_IFREE)
> +	switch (field) {
> +	case XFS_SBS_ICOUNT:
> +	case XFS_SBS_IFREE:
> +	case XFS_SBS_FDBLOCKS:
>  		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
> +	default:
> +		break;
> +	}
>  
>  	spin_lock(&mp->m_sb_lock);
>  	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
> @@ -1309,7 +1346,7 @@ xfs_mod_incore_sb(
>   *
>   * Note that this function may not be used for the superblock values that
>   * are tracked with the in-memory per-cpu counters - a direct call to
> - * xfs_icsb_modify_counters is required for these.
> + * xfs_mod_incore_sb is required for these.
>   */
>  int
>  xfs_mod_incore_sb_batch(
> @@ -1494,7 +1531,6 @@ xfs_icsb_cpu_notify(
>  	case CPU_ONLINE:
>  	case CPU_ONLINE_FROZEN:
>  		xfs_icsb_lock(mp);
> -		xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
>  		xfs_icsb_unlock(mp);
>  		break;
>  	case CPU_DEAD:
> @@ -1504,13 +1540,9 @@ xfs_icsb_cpu_notify(
>  		 * re-enable the counters. */
>  		xfs_icsb_lock(mp);
>  		spin_lock(&mp->m_sb_lock);
> -		xfs_icsb_disable_counter(mp, XFS_SBS_FDBLOCKS);
> -
> -		mp->m_sb.sb_fdblocks += cntp->icsb_fdblocks;
>  
>  		memset(cntp, 0, sizeof(xfs_icsb_cnts_t));
>  
> -		xfs_icsb_balance_counter_locked(mp, XFS_SBS_FDBLOCKS, 0);
>  		spin_unlock(&mp->m_sb_lock);
>  		xfs_icsb_unlock(mp);
>  		break;
> @@ -1535,9 +1567,13 @@ xfs_icsb_init_counters(
>  	if (i)
>  		goto free_icount;
>  
> +	i = percpu_counter_init(&mp->m_sb.sb_fdblocks, 0, GFP_KERNEL);
> +	if (i)
> +		goto free_ifree;
> +
>  	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
>  	if (!mp->m_sb_cnts)
> -		goto free_ifree;
> +		goto free_fdblocks;
>  
>  	for_each_online_cpu(i) {
>  		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
> @@ -1560,6 +1596,8 @@ xfs_icsb_init_counters(
>  
>  	return 0;
>  
> +free_fdblocks:
> +	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
>  free_ifree:
>  	percpu_counter_destroy(&mp->m_sb.sb_ifree);
>  free_icount:
> @@ -1577,7 +1615,6 @@ xfs_icsb_reinit_counters(
>  	 * initial balance kicks us off correctly
>  	 */
>  	mp->m_icsb_counters = -1;
> -	xfs_icsb_balance_counter(mp, XFS_SBS_FDBLOCKS, 0);
>  	xfs_icsb_unlock(mp);
>  }
>  
> @@ -1592,6 +1629,7 @@ xfs_icsb_destroy_counters(
>  
>  	percpu_counter_destroy(&mp->m_sb.sb_icount);
>  	percpu_counter_destroy(&mp->m_sb.sb_ifree);
> +	percpu_counter_destroy(&mp->m_sb.sb_fdblocks);
>  
>  	mutex_destroy(&mp->m_icsb_mutex);
>  }
> @@ -1645,18 +1683,11 @@ xfs_icsb_count(
>  	xfs_icsb_cnts_t	*cnt,
>  	int		flags)
>  {
> -	xfs_icsb_cnts_t *cntp;
> -	int		i;
> -
>  	memset(cnt, 0, sizeof(xfs_icsb_cnts_t));
>  
>  	if (!(flags & XFS_ICSB_LAZY_COUNT))
>  		xfs_icsb_lock_all_counters(mp);
>  
> -	for_each_online_cpu(i) {
> -		cntp = (xfs_icsb_cnts_t *)per_cpu_ptr(mp->m_sb_cnts, i);
> -		cnt->icsb_fdblocks += cntp->icsb_fdblocks;
> -	}
>  
>  	if (!(flags & XFS_ICSB_LAZY_COUNT))
>  		xfs_icsb_unlock_all_counters(mp);
> @@ -1667,7 +1698,6 @@ xfs_icsb_counter_disabled(
>  	xfs_mount_t	*mp,
>  	xfs_sb_field_t	field)
>  {
> -	ASSERT(field == XFS_SBS_FDBLOCKS);
>  	return test_bit(field, &mp->m_icsb_counters);
>  }
>  
> @@ -1678,8 +1708,6 @@ xfs_icsb_disable_counter(
>  {
>  	xfs_icsb_cnts_t	cnt;
>  
> -	ASSERT(field == XFS_SBS_FDBLOCKS);
> -
>  	/*
>  	 * If we are already disabled, then there is nothing to do
>  	 * here. We check before locking all the counters to avoid
> @@ -1697,9 +1725,6 @@ xfs_icsb_disable_counter(
>  
>  		xfs_icsb_count(mp, &cnt, XFS_ICSB_LAZY_COUNT);
>  		switch(field) {
> -		case XFS_SBS_FDBLOCKS:
> -			mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
> -			break;
>  		default:
>  			BUG();
>  		}
> @@ -1715,18 +1740,11 @@ xfs_icsb_enable_counter(
>  	uint64_t	count,
>  	uint64_t	resid)
>  {
> -	xfs_icsb_cnts_t	*cntp;
>  	int		i;
>  
> -	ASSERT(field == XFS_SBS_FDBLOCKS);
> -
>  	xfs_icsb_lock_all_counters(mp);
>  	for_each_online_cpu(i) {
> -		cntp = per_cpu_ptr(mp->m_sb_cnts, i);
>  		switch (field) {
> -		case XFS_SBS_FDBLOCKS:
> -			cntp->icsb_fdblocks = count + resid;
> -			break;
>  		default:
>  			BUG();
>  			break;
> @@ -1745,9 +1763,6 @@ xfs_icsb_sync_counters_locked(
>  	xfs_icsb_cnts_t	cnt;
>  
>  	xfs_icsb_count(mp, &cnt, flags);
> -
> -	if (!xfs_icsb_counter_disabled(mp, XFS_SBS_FDBLOCKS))
> -		mp->m_sb.sb_fdblocks = cnt.icsb_fdblocks;
>  }
>  
>  /*
> @@ -1789,20 +1804,12 @@ xfs_icsb_balance_counter_locked(
>  	int		min_per_cpu)
>  {
>  	uint64_t	count, resid;
> -	int		weight = num_online_cpus();
> -	uint64_t	min = (uint64_t)min_per_cpu;
>  
>  	/* disable counter and sync counter */
>  	xfs_icsb_disable_counter(mp, field);
>  
>  	/* update counters  - first CPU gets residual*/
>  	switch (field) {
> -	case XFS_SBS_FDBLOCKS:
> -		count = mp->m_sb.sb_fdblocks;
> -		resid = do_div(count, weight);
> -		if (count < max(min, XFS_ICSB_FDBLK_CNTR_REENABLE(mp)))
> -			return;
> -		break;
>  	default:
>  		BUG();
>  		count = resid = 0;	/* quiet, gcc */
> @@ -1831,7 +1838,6 @@ xfs_icsb_modify_counters(
>  	int		rsvd)
>  {
>  	xfs_icsb_cnts_t	*icsbp;
> -	long long	lcounter;	/* long counter for 64 bit fields */
>  	int		ret = 0;
>  
>  	might_sleep();
> @@ -1851,18 +1857,9 @@ again:
>  	}
>  
>  	switch (field) {
> -	case XFS_SBS_FDBLOCKS:
> -		BUG_ON((mp->m_resblks - mp->m_resblks_avail) != 0);
> -
> -		lcounter = icsbp->icsb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
> -		lcounter += delta;
> -		if (unlikely(lcounter < 0))
> -			goto balance_counter;
> -		icsbp->icsb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
> -		break;
>  	default:
>  		BUG();
> -		break;
> +		goto balance_counter; /* be still, gcc */
>  	}
>  	xfs_icsb_unlock_cntr(icsbp);
>  	preempt_enable();
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index c17bfa4..0fa688a 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -1089,6 +1089,7 @@ xfs_fs_statfs(
>  	__uint64_t		fakeinos, id;
>  	__uint64_t		sb_icount;
>  	__uint64_t		sb_ifree;
> +	__uint64_t		sb_fdblocks;
>  	xfs_extlen_t		lsize;
>  	__int64_t		ffree;
>  
> @@ -1100,15 +1101,20 @@ xfs_fs_statfs(
>  	statp->f_fsid.val[1] = (u32)(id >> 32);
>  
>  	xfs_icsb_sync_counters(mp, XFS_ICSB_LAZY_COUNT);
> +
>  	sb_icount = percpu_counter_sum(&sbp->sb_icount);
>  	sb_ifree = percpu_counter_sum(&sbp->sb_ifree);
> +	sb_fdblocks = percpu_counter_sum(&sbp->sb_fdblocks);
>  
>  	spin_lock(&mp->m_sb_lock);
>  	statp->f_bsize = sbp->sb_blocksize;
>  	lsize = sbp->sb_logstart ? sbp->sb_logblocks : 0;
>  	statp->f_blocks = sbp->sb_dblocks - lsize;
> -	statp->f_bfree = statp->f_bavail =
> -				sbp->sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
> +	spin_unlock(&mp->m_sb_lock);
> +
> +	statp->f_bfree = sb_fdblocks - XFS_ALLOC_SET_ASIDE(mp);
> +	statp->f_bavail = statp->f_bfree;
> +
>  	fakeinos = statp->f_bfree << sbp->sb_inopblog;
>  	statp->f_files = MIN(sb_icount + fakeinos,
>  			     (__uint64_t)XFS_MAXINUMBER);
> @@ -1121,7 +1127,6 @@ xfs_fs_statfs(
>  	ffree = statp->f_files - (sb_icount - sb_ifree);
>  	statp->f_ffree = max_t(__int64_t, ffree, 0);
>  
> -	spin_unlock(&mp->m_sb_lock);
>  
>  	if ((ip->i_d.di_flags & XFS_DIFLAG_PROJINHERIT) &&
>  	    ((mp->m_qflags & (XFS_PQUOTA_ACCT|XFS_PQUOTA_ENFD))) ==
> diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
> index 6efc7a2..f649d1a 100644
> --- a/fs/xfs/xfs_super.h
> +++ b/fs/xfs/xfs_super.h
> @@ -99,8 +99,8 @@ struct xfs_sb {
>  
>  	struct percpu_counter	sb_icount;	/* allocated inodes */
>  	struct percpu_counter	sb_ifree;	/* free inodes */
> +	struct percpu_counter	sb_fdblocks;	/* free data blocks */
>  
> -	__uint64_t	sb_fdblocks;	/* free data blocks */
>  	__uint64_t	sb_frextents;	/* free realtime extents */
>  	xfs_ino_t	sb_uquotino;	/* user quota inode */
>  	xfs_ino_t	sb_gquotino;	/* group quota inode */
> diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
> index c54d4b7..b7da423 100644
> --- a/fs/xfs/xfs_trans.c
> +++ b/fs/xfs/xfs_trans.c
> @@ -184,7 +184,7 @@ xfs_trans_reserve(
>  	 * fail if the count would go below zero.
>  	 */
>  	if (blocks > 0) {
> -		error = xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
> +		error = xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
>  					  -((int64_t)blocks), rsvd);
>  		if (error != 0) {
>  			current_restore_flags_nested(&tp->t_pflags, PF_FSTRANS);
> @@ -268,7 +268,7 @@ undo_log:
>  
>  undo_blocks:
>  	if (blocks > 0) {
> -		xfs_icsb_modify_counters(tp->t_mountp, XFS_SBS_FDBLOCKS,
> +		xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
>  					 (int64_t)blocks, rsvd);
>  		tp->t_blk_res = 0;
>  	}
> @@ -547,8 +547,7 @@ xfs_trans_unreserve_and_mod_sb(
>  
>  	/* apply the per-cpu counters */
>  	if (blkdelta) {
> -		error = xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS,
> -						 blkdelta, rsvd);
> +		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, blkdelta, rsvd);
>  		if (error)
>  			goto out;
>  	}
> @@ -635,7 +634,7 @@ out_undo_icount:
>  		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
>  out_undo_fdblocks:
>  	if (blkdelta)
> -		xfs_icsb_modify_counters(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
> +		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
>  out:
>  	ASSERT(error == 0);
>  	return;
> -- 
> 2.0.0
> 

* Re: [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
  2015-02-02  8:41   ` Christoph Hellwig
@ 2015-02-02 19:30     ` Dave Chinner
  2015-02-03 21:37       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2015-02-02 19:30 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Mon, Feb 02, 2015 at 12:41:02AM -0800, Christoph Hellwig wrote:
> >  /*
> > - * Superblock - in core version.  Must match the ondisk version below.
> > - * Must be padded to 64 bit alignment.
> > - */
> > -typedef struct xfs_sb {
> > -	__uint32_t	sb_magicnum;	/* magic number == XFS_SB_MAGIC */
> > -	__uint32_t	sb_blocksize;	/* logical block size, bytes */
> 
> > -static inline int xfs_sb_version_hasfinobt(xfs_sb_t *sbp)
> > +static inline int xfs_sb_version_hasfinobt(struct xfs_sb *sbp)
> 
> So xfs_format.h now requires struct xfs_sb to be defined before it
> can be included?  I guess we need to move these macros around as well.

Yes, that's why I moved it to xfs_super.h, which is included from
xfs_linux.h. I just wanted to move it from xfs_format.h to
/somewhere/ to demonstrate that it isn't part of the on-disk format
anymore.

> > diff --git a/fs/xfs/xfs_super.h b/fs/xfs/xfs_super.h
> > index 2b830c2..a02236b 100644
> > --- a/fs/xfs/xfs_super.h
> > +++ b/fs/xfs/xfs_super.h
> > @@ -61,6 +61,87 @@ struct xfs_mount;
> >  struct xfs_buftarg;
> >  struct block_device;
> >  
> > +/*
> > + * Superblock - in core version.  This does not have to match the size and shape
> > + * of the on-disk superblock, but must contain all the fields that we use in the
> > + * on-disk superblock.
> > + */
> > +struct xfs_sb {
> 
> Is this really the right header?  xfs_super.h only really is for bits
> related to linux super block operations.

No, probably not, but it was expedient and good enough for an RFC
and sanity testing. I thought about xfs_mount.h, but then realised
that just introduces a new dependency between xfs_mount.h and
libxfs, which is something I'm trying to reduce.

> I'd expect to move it close to struct xfs_mount, and maybe even merge
> it into that in the long run.

I guess moving the structure there is fine, but we still want all
the version functions to be shared with userspace, which then makes
for an interesting set of dependencies. Any other ideas?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 2/5] xfs: use generic percpu counters for inode counter
  2015-02-02 16:44   ` Christoph Hellwig
@ 2015-02-02 19:33     ` Dave Chinner
  2015-02-03 21:38       ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2015-02-02 19:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Mon, Feb 02, 2015 at 08:44:09AM -0800, Christoph Hellwig wrote:
> > diff --git a/fs/xfs/libxfs/xfs_sb.c b/fs/xfs/libxfs/xfs_sb.c
> > index 4cf335b..7bfa527 100644
> > --- a/fs/xfs/libxfs/xfs_sb.c
> > +++ b/fs/xfs/libxfs/xfs_sb.c
> > @@ -357,7 +357,8 @@ __xfs_sb_from_disk(
> >  	to->sb_rextslog = from->sb_rextslog;
> >  	to->sb_inprogress = from->sb_inprogress;
> >  	to->sb_imax_pct = from->sb_imax_pct;
> > -	to->sb_icount = be64_to_cpu(from->sb_icount);
> > +	if (percpu_counter_initialized(&to->sb_icount))
> > +		percpu_counter_set(&to->sb_icount, be64_to_cpu(from->sb_icount));
> 
> Why would the percpu counter not be initialized here?  Oh, I guess
> this is for xfs_sb_verify().  But why can't xfs_mount_validate_sb simply
> operate on the disk endian SB to avoid that whole issue?

Possibly. I'll look into it.

> > @@ -1288,8 +1288,11 @@ xfs_mod_incore_sb(
> >  	int			status;
> >  
> >  #ifdef HAVE_PERCPU_SB
> > -	ASSERT(field < XFS_SBS_ICOUNT || field > XFS_SBS_FDBLOCKS);
> > +	ASSERT(field < XFS_SBS_IFREE || field > XFS_SBS_FDBLOCKS);
> >  #endif
> > +	if (field == XFS_SBS_ICOUNT)
> > +		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
> > +
> 
> Why is this multiplexed through xfs_mod_incore_sb_unlocked while needing
> a different locking context?  Shouldn't we simply use a different helper
> for this case?

Again, expedient. To fix, I need to export
xfs_mod_incore_sb_unlocked().

> >  	xfs_icsb_cnts_t *cntp;
> >  	int		i;
> >  
> > +	i = percpu_counter_init(&mp->m_sb.sb_icount, 0, GFP_KERNEL);
> > +	if (i)
> > +		return ENOMEM;
> > +
> >  	mp->m_sb_cnts = alloc_percpu(xfs_icsb_cnts_t);
> > -	if (mp->m_sb_cnts == NULL)
> > +	if (!mp->m_sb_cnts) {
> > +		percpu_counter_destroy(&mp->m_sb.sb_icount);
> >  		return -ENOMEM;
> > +	}
> >  
> >  	for_each_online_cpu(i) {
> 
> Reusing a variable for both an errno value and a loop iterator is
> not very readable, just add an additional "error" variabe.

In the end it gets renamed to error. I'll fix it up.

> Also percpu_counter_init returns a proper negative errno value, no need
> to turn that into the incorrect positive ENOMEM.

Oversight. Will fix.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 4/5] xfs: use generic percpu counters for free block counter
  2015-02-02 16:48   ` Christoph Hellwig
@ 2015-02-02 19:34     ` Dave Chinner
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2015-02-02 19:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Mon, Feb 02, 2015 at 08:48:47AM -0800, Christoph Hellwig wrote:
> 
> This screams for having two different helpers for removing vs adding back
> reserved blocks.

Yup, that it does.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 4/5] xfs: use generic percpu counters for free block counter
  2015-02-02 17:11   ` Brian Foster
@ 2015-02-02 19:39     ` Dave Chinner
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2015-02-02 19:39 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Mon, Feb 02, 2015 at 12:11:33PM -0500, Brian Foster wrote:
> On Mon, Feb 02, 2015 at 08:43:02AM +1100, Dave Chinner wrote:
> > +		/*
> > +		 * Taking blocks away, need to be more accurate the closer we
> > +		 * are to zero.
> > +		 *
> > +		 * batch size is set to a maximum of 1024 blocks - if we are
> > +		 * allocating or freeing extents larger than this then we aren't
> > +		 * going to be hammering the counter lock so a lock per update
> > +		 * is not a problem.
> > +		 *
> 
> IIUC, the batch size determines the point at which the local cpu delta
> is folded back into the global counter (under lock). If we're allocating
> large extents, these will surpass the batch size and result in a locked
> update. Smaller updates are aggregated into the cpu counter and folded
> in at some later time.

Right.

> > +		 * If the counter has a value of less than 2 * max batch size,
> > +		 * then make everything serialise as we are real close to
> > +		 * ENOSPC.
> > +		 */
> > +#define __BATCH	1024
> > +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> > +					   2 * __BATCH) < 0)
> > +			batch = 1;
> > +		else
> > +			batch = __BATCH;
> > +
> 
> The general approach seems logical. I do wonder whether blocks is the
> right scale as opposed to block count normalized against some fixed I/O
> size (to account for different block sizes).

We allocate in blocks, so the IO size is really irrelevant. The
scalability issue at hand is page-by-page space reservation during
delayed allocation, so really the block size makes less difference
to performance than the page size....

> Also, it seems like speculative preallocation could hide some of the
> overhead here, depending on workload of course. Had that factored into
> your testing?

Yes, somewhat, though I should do some testing using 4k direct IO and
buffered IO with allocsize set appropriately.

> 
> > +		__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
> > +		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
> > +					   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
> > +			/* we had space! */
> > +			return 0;
> >  		}
> >  
> > -		mp->m_sb.sb_fdblocks = lcounter + XFS_ALLOC_SET_ASIDE(mp);
> > -		return 0;
> > +		/*
> > +		 * lock up the sb for dipping into reserves before releasing
> > +		 * the space that took us to ENOSPC.
> > +		 */
> > +		spin_lock(&mp->m_sb_lock);
> 
> Can you elaborate on the locking here, why it's needed where it wasn't
> before?

The lock protects the reserve pool. It was effectively held before,
as the only time we called into this function was with the m_sb_lock
already taken. This is a bit of a hack because we now call into the
function without the lock held....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
  2015-02-02 19:30     ` Dave Chinner
@ 2015-02-03 21:37       ` Christoph Hellwig
  2015-02-03 21:46         ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-03 21:37 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Tue, Feb 03, 2015 at 06:30:21AM +1100, Dave Chinner wrote:
> > I'd expect to move it close to stuct xfs_mount, and maybe even merge
> > it into that in the long run.
> 
> I guess moving the structure there is fine, but we still want all
> the version functions to be shared with userspace, which then makes
> for an interesting set of dependencies. Any other ideas?

Are they really worth sharing?  If they are, we'll need something
that can expect an xfs_sb/xfs_mount to be defined.


* Re: [PATCH 2/5] xfs: use generic percpu counters for inode counter
  2015-02-02 19:33     ` Dave Chinner
@ 2015-02-03 21:38       ` Christoph Hellwig
  0 siblings, 0 replies; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-03 21:38 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Tue, Feb 03, 2015 at 06:33:44AM +1100, Dave Chinner wrote:
> > Why would the percpu counter not be initialized here?  Oh, I guess
> > this is for xfs_sb_verify().  But why can't xfs_mount_validate_sb simply
> > operate on the disk endian SB to avoid that whole issue?
> 
> Possibly. I'll look into it.

A simpler alternative would be to move them out of __xfs_sb_from_disk
into the one caller that needs these values.
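
Roughly, that would keep __xfs_sb_from_disk() free of percpu counter
updates and seed the counters from the one mount-time caller instead.
A hypothetical sketch of the shape (argument list assumed from the
current signature, untested):

	/* mount path: convert the superblock, then seed the counters */
	__xfs_sb_from_disk(&mp->m_sb, dsb, true);
	percpu_counter_set(&mp->m_sb.sb_icount, be64_to_cpu(dsb->sb_icount));
	percpu_counter_set(&mp->m_sb.sb_ifree, be64_to_cpu(dsb->sb_ifree));
	percpu_counter_set(&mp->m_sb.sb_fdblocks, be64_to_cpu(dsb->sb_fdblocks));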


* Re: [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
  2015-02-03 21:37       ` Christoph Hellwig
@ 2015-02-03 21:46         ` Dave Chinner
  2015-02-03 23:34           ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2015-02-03 21:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Tue, Feb 03, 2015 at 01:37:44PM -0800, Christoph Hellwig wrote:
> On Tue, Feb 03, 2015 at 06:30:21AM +1100, Dave Chinner wrote:
> > > I'd expect to move it close to struct xfs_mount, and maybe even merge
> > > it into that in the long run.
> > 
> > I guess moving the structure there is fine, but we still want all
> > the version functions to be shared with userspace, which then makes
> > for an interesting set of dependencies. Any other ideas?
> 
> Are they really worth sharing?  If they are, we'll need something
> that can expect an xfs_sb/xfs_mount to be defined.

I suppose we could stop sharing them - they change rarely enough
and it's only a few lines of code for each new feature that would
then need to be duplicated. Not a huge burden...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC PATCH 0/5] xfs: use generic percpu counters for icsb
  2015-02-01 21:42 [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Dave Chinner
                   ` (4 preceding siblings ...)
  2015-02-01 21:43 ` [PATCH 5/5] xfs: Remove icsb infrastructure Dave Chinner
@ 2015-02-03 21:50 ` Christoph Hellwig
  2015-02-03 21:58   ` Dave Chinner
  5 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-03 21:50 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

FYI, I think we should just get rid of the horrible xfs_mod_incore_sb(_batch)
interface as part of this.  The patch below applies on top of your
series.
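
The replacement is one typed helper per counter, so a caller that used
to multiplex through xfs_mod_incore_sb() ends up with the shape below
(mirroring the xfs_trans_reserve() hunk in the patch):

	/* take the block reservation, optionally dipping into the reserve pool */
	error = xfs_sb_mod_fdblocks(mp, -((int64_t)blocks), rsvd);
	if (error)
		return -ENOSPC;

	/* the undo path simply adds the same delta back */
	xfs_sb_mod_fdblocks(mp, (int64_t)blocks, rsvd);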

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index ac4d64e..a45e929b 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -2212,9 +2212,8 @@ xfs_bmap_add_extent_delay_real(
 		diff = (int)(temp + temp2 - startblockval(PREV.br_startblock) -
 			(bma->cur ? bma->cur->bc_private.b.allocated : 0));
 		if (diff > 0) {
-			error = xfs_mod_incore_sb(bma->ip->i_mount,
-					XFS_SBS_FDBLOCKS,
-					-((int64_t)diff), 0);
+			error = xfs_sb_mod_fdblocks(bma->ip->i_mount,
+					-((int64_t)diff), false);
 			ASSERT(!error);
 			if (error)
 				goto done;
@@ -2265,9 +2264,8 @@ xfs_bmap_add_extent_delay_real(
 			temp += bma->cur->bc_private.b.allocated;
 		ASSERT(temp <= da_old);
 		if (temp < da_old)
-			xfs_mod_incore_sb(bma->ip->i_mount,
-					XFS_SBS_FDBLOCKS,
-					(int64_t)(da_old - temp), 0);
+			xfs_sb_mod_fdblocks(bma->ip->i_mount,
+					(int64_t)(da_old - temp), false);
 	}
 
 	/* clear out the allocated field, done with it now in any case. */
@@ -2944,8 +2942,8 @@ xfs_bmap_add_extent_hole_delay(
 	}
 	if (oldlen != newlen) {
 		ASSERT(oldlen > newlen);
-		xfs_mod_incore_sb(ip->i_mount, XFS_SBS_FDBLOCKS,
-			(int64_t)(oldlen - newlen), 0);
+		xfs_sb_mod_fdblocks(ip->i_mount, (int64_t)(oldlen - newlen),
+					false);
 		/*
 		 * Nothing to do for disk quota accounting here.
 		 */
@@ -4159,19 +4157,15 @@ xfs_bmapi_reserve_delalloc(
 	indlen = (xfs_extlen_t)xfs_bmap_worst_indlen(ip, alen);
 	ASSERT(indlen > 0);
 
-	if (rt) {
-		error = xfs_mod_incore_sb(mp, XFS_SBS_FREXTENTS,
-					  -((int64_t)extsz), 0);
-	} else {
-		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
-						 -((int64_t)alen), 0);
-	}
+	if (rt)
+		error = xfs_sb_mod_frextents(mp, -((int64_t)extsz));
+	else
+		error = xfs_sb_mod_fdblocks(mp, -((int64_t)alen), false);
 
 	if (error)
 		goto out_unreserve_quota;
 
-	error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
-					 -((int64_t)indlen), 0);
+	error = xfs_sb_mod_fdblocks(mp, -((int64_t)indlen), false);
 	if (error)
 		goto out_unreserve_blocks;
 
@@ -4198,9 +4192,9 @@ xfs_bmapi_reserve_delalloc(
 
 out_unreserve_blocks:
 	if (rt)
-		xfs_mod_incore_sb(mp, XFS_SBS_FREXTENTS, extsz, 0);
+		xfs_sb_mod_frextents(mp, extsz);
 	else
-		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, alen, 0);
+		xfs_sb_mod_fdblocks(mp, alen, false);
 out_unreserve_quota:
 	if (XFS_IS_QUOTA_ON(mp))
 		xfs_trans_unreserve_quota_nblks(NULL, ip, (long)alen, 0, rt ?
@@ -5012,10 +5006,8 @@ xfs_bmap_del_extent(
 	 * Nothing to do for disk quota accounting here.
 	 */
 	ASSERT(da_old >= da_new);
-	if (da_old > da_new) {
-		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
-			(int64_t)(da_old - da_new), 0);
-	}
+	if (da_old > da_new)
+		xfs_sb_mod_fdblocks(mp, (int64_t)(da_old - da_new), false);
 done:
 	*logflagsp = flags;
 	return error;
@@ -5284,14 +5276,13 @@ xfs_bunmapi(
 
 				rtexts = XFS_FSB_TO_B(mp, del.br_blockcount);
 				do_div(rtexts, mp->m_sb.sb_rextsize);
-				xfs_mod_incore_sb(mp, XFS_SBS_FREXTENTS,
-						(int64_t)rtexts, 0);
+				xfs_sb_mod_frextents(mp, (int64_t)rtexts);
 				(void)xfs_trans_reserve_quota_nblks(NULL,
 					ip, -((long)del.br_blockcount), 0,
 					XFS_QMOPT_RES_RTBLKS);
 			} else {
-				xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS,
-						(int64_t)del.br_blockcount, 0);
+				xfs_sb_mod_fdblocks(mp,
+					(int64_t)del.br_blockcount, false);
 				(void)xfs_trans_reserve_quota_nblks(NULL,
 					ip, -((long)del.br_blockcount), 0,
 					XFS_QMOPT_RES_REGBLKS);
diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 2c8d11f..2a6dfac 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -175,69 +175,6 @@ typedef struct xfs_dsb {
 } xfs_dsb_t;
 
 /*
- * Sequence number values for the fields.
- */
-typedef enum {
-	XFS_SBS_MAGICNUM, XFS_SBS_BLOCKSIZE, XFS_SBS_DBLOCKS, XFS_SBS_RBLOCKS,
-	XFS_SBS_REXTENTS, XFS_SBS_UUID, XFS_SBS_LOGSTART, XFS_SBS_ROOTINO,
-	XFS_SBS_RBMINO, XFS_SBS_RSUMINO, XFS_SBS_REXTSIZE, XFS_SBS_AGBLOCKS,
-	XFS_SBS_AGCOUNT, XFS_SBS_RBMBLOCKS, XFS_SBS_LOGBLOCKS,
-	XFS_SBS_VERSIONNUM, XFS_SBS_SECTSIZE, XFS_SBS_INODESIZE,
-	XFS_SBS_INOPBLOCK, XFS_SBS_FNAME, XFS_SBS_BLOCKLOG,
-	XFS_SBS_SECTLOG, XFS_SBS_INODELOG, XFS_SBS_INOPBLOG, XFS_SBS_AGBLKLOG,
-	XFS_SBS_REXTSLOG, XFS_SBS_INPROGRESS, XFS_SBS_IMAX_PCT, XFS_SBS_ICOUNT,
-	XFS_SBS_IFREE, XFS_SBS_FDBLOCKS, XFS_SBS_FREXTENTS, XFS_SBS_UQUOTINO,
-	XFS_SBS_GQUOTINO, XFS_SBS_QFLAGS, XFS_SBS_FLAGS, XFS_SBS_SHARED_VN,
-	XFS_SBS_INOALIGNMT, XFS_SBS_UNIT, XFS_SBS_WIDTH, XFS_SBS_DIRBLKLOG,
-	XFS_SBS_LOGSECTLOG, XFS_SBS_LOGSECTSIZE, XFS_SBS_LOGSUNIT,
-	XFS_SBS_FEATURES2, XFS_SBS_BAD_FEATURES2, XFS_SBS_FEATURES_COMPAT,
-	XFS_SBS_FEATURES_RO_COMPAT, XFS_SBS_FEATURES_INCOMPAT,
-	XFS_SBS_FEATURES_LOG_INCOMPAT, XFS_SBS_CRC, XFS_SBS_PAD,
-	XFS_SBS_PQUOTINO, XFS_SBS_LSN,
-	XFS_SBS_FIELDCOUNT
-} xfs_sb_field_t;
-
-/*
- * Mask values, defined based on the xfs_sb_field_t values.
- * Only define the ones we're using.
- */
-#define	XFS_SB_MVAL(x)		(1LL << XFS_SBS_ ## x)
-#define	XFS_SB_UUID		XFS_SB_MVAL(UUID)
-#define	XFS_SB_FNAME		XFS_SB_MVAL(FNAME)
-#define	XFS_SB_ROOTINO		XFS_SB_MVAL(ROOTINO)
-#define	XFS_SB_RBMINO		XFS_SB_MVAL(RBMINO)
-#define	XFS_SB_RSUMINO		XFS_SB_MVAL(RSUMINO)
-#define	XFS_SB_VERSIONNUM	XFS_SB_MVAL(VERSIONNUM)
-#define XFS_SB_UQUOTINO		XFS_SB_MVAL(UQUOTINO)
-#define XFS_SB_GQUOTINO		XFS_SB_MVAL(GQUOTINO)
-#define XFS_SB_QFLAGS		XFS_SB_MVAL(QFLAGS)
-#define XFS_SB_SHARED_VN	XFS_SB_MVAL(SHARED_VN)
-#define XFS_SB_UNIT		XFS_SB_MVAL(UNIT)
-#define XFS_SB_WIDTH		XFS_SB_MVAL(WIDTH)
-#define XFS_SB_ICOUNT		XFS_SB_MVAL(ICOUNT)
-#define XFS_SB_IFREE		XFS_SB_MVAL(IFREE)
-#define XFS_SB_FDBLOCKS		XFS_SB_MVAL(FDBLOCKS)
-#define XFS_SB_FEATURES2	(XFS_SB_MVAL(FEATURES2) | \
-				 XFS_SB_MVAL(BAD_FEATURES2))
-#define XFS_SB_FEATURES_COMPAT	XFS_SB_MVAL(FEATURES_COMPAT)
-#define XFS_SB_FEATURES_RO_COMPAT XFS_SB_MVAL(FEATURES_RO_COMPAT)
-#define XFS_SB_FEATURES_INCOMPAT XFS_SB_MVAL(FEATURES_INCOMPAT)
-#define XFS_SB_FEATURES_LOG_INCOMPAT XFS_SB_MVAL(FEATURES_LOG_INCOMPAT)
-#define XFS_SB_CRC		XFS_SB_MVAL(CRC)
-#define XFS_SB_PQUOTINO		XFS_SB_MVAL(PQUOTINO)
-#define	XFS_SB_NUM_BITS		((int)XFS_SBS_FIELDCOUNT)
-#define	XFS_SB_ALL_BITS		((1LL << XFS_SB_NUM_BITS) - 1)
-#define	XFS_SB_MOD_BITS		\
-	(XFS_SB_UUID | XFS_SB_ROOTINO | XFS_SB_RBMINO | XFS_SB_RSUMINO | \
-	 XFS_SB_VERSIONNUM | XFS_SB_UQUOTINO | XFS_SB_GQUOTINO | \
-	 XFS_SB_QFLAGS | XFS_SB_SHARED_VN | XFS_SB_UNIT | XFS_SB_WIDTH | \
-	 XFS_SB_ICOUNT | XFS_SB_IFREE | XFS_SB_FDBLOCKS | XFS_SB_FEATURES2 | \
-	 XFS_SB_FEATURES_COMPAT | XFS_SB_FEATURES_RO_COMPAT | \
-	 XFS_SB_FEATURES_INCOMPAT | XFS_SB_FEATURES_LOG_INCOMPAT | \
-	 XFS_SB_PQUOTINO)
-
-
-/*
  * Misc. Flags - warning - these will be cleared by xfs_repair unless
  * a feature bit is set when the flag is used.
  */
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 28389e0..be2d795e 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -749,7 +749,7 @@ out:
 		 * the extra reserve blocks from the reserve.....
 		 */
 		int error;
-		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, fdblks_delta, 0);
+		error = xfs_sb_mod_fdblocks(mp, fdblks_delta, false);
 		if (error == -ENOSPC)
 			goto retry;
 	}
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 07498f0..8a6bbf4 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1082,292 +1082,135 @@ xfs_log_sbcount(xfs_mount_t *mp)
 	return xfs_sync_sb(mp, true);
 }
 
-/*
- * xfs_mod_incore_sb_unlocked() is a utility routine commonly used to apply
- * a delta to a specified field in the in-core superblock.  Simply
- * switch on the field indicated and apply the delta to that field.
- * Fields are not allowed to dip below zero, so if the delta would
- * do this do not apply it and return EINVAL.
- *
- * The m_sb_lock must be held when this routine is called.
- */
-STATIC int
-xfs_mod_incore_sb_unlocked(
-	xfs_mount_t	*mp,
-	xfs_sb_field_t	field,
-	int64_t		delta,
-	int		rsvd)
+int
+xfs_sb_mod_fdblocks(
+	struct xfs_mount	*mp,
+	int64_t			delta,
+	bool			rsvd)
 {
-	int		scounter;	/* short counter for 32 bit fields */
-	long long	lcounter;	/* long counter for 64 bit fields */
-	long long	res_used;
-	s32		batch;
+	int64_t			lcounter;
+	long long		res_used;
+	s32			batch;
 
-	/*
-	 * With the in-core superblock spin lock held, switch
-	 * on the indicated field.  Apply the delta to the
-	 * proper field.  If the fields value would dip below
-	 * 0, then do not apply the delta and return EINVAL.
-	 */
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-		/* deltas are +/-64, hence the large batch size of 128. */
-		__percpu_counter_add(&mp->m_sb.sb_icount, delta, 128);
-		if (percpu_counter_compare(&mp->m_sb.sb_icount, 0) < 0) {
-			ASSERT(0);
-			percpu_counter_add(&mp->m_sb.sb_icount, -delta);
-			return -EINVAL;
-		}
-		return 0;
-	case XFS_SBS_IFREE:
-		percpu_counter_add(&mp->m_sb.sb_ifree, delta);
-		if (percpu_counter_compare(&mp->m_sb.sb_ifree, 0) < 0) {
-			ASSERT(0);
-			percpu_counter_add(&mp->m_sb.sb_ifree, -delta);
-			return -EINVAL;
+	if (delta > 0) {		/* Putting blocks back */
+		if (mp->m_resblks == mp->m_resblks_avail) {
+			percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
+			return 0;
 		}
-		return 0;
-	case XFS_SBS_FDBLOCKS:
-
-		if (delta > 0) {		/* Putting blocks back */
-			if (mp->m_resblks == mp->m_resblks_avail) {
-				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
-				return 0;
-			}
 
-			/* put blocks back into reserve pool first */
-			spin_lock(&mp->m_sb_lock);
-			res_used = (long long)
-					(mp->m_resblks - mp->m_resblks_avail);
+		/* put blocks back into reserve pool first */
+		spin_lock(&mp->m_sb_lock);
+		res_used = (long long)
+				(mp->m_resblks - mp->m_resblks_avail);
 
-			if (res_used > delta) {
-				mp->m_resblks_avail += delta;
-			} else {
-				delta -= res_used;
+		if (res_used > delta) {
+			mp->m_resblks_avail += delta;
+		} else {
+			delta -= res_used;
 				mp->m_resblks_avail = mp->m_resblks;
-				percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
-			}
-			spin_unlock(&mp->m_sb_lock);
-			return 0;
-
+			percpu_counter_add(&mp->m_sb.sb_fdblocks, delta);
 		}
+		spin_unlock(&mp->m_sb_lock);
+		return 0;
 
-		/*
-		 * Taking blocks away, need to be more accurate the closer we
-		 * are to zero.
-		 *
-		 * batch size is set to a maximum of 1024 blocks - if we are
-		 * allocating or freeing extents larger than this then we aren't
-		 * going to be hammering the counter lock so a lock per update
-		 * is not a problem.
-		 *
-		 * If the counter has a value of less than 2 * max batch size,
-		 * then make everything serialise as we are real close to
-		 * ENOSPC.
-		 */
-#define __BATCH	1024
-		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
-					   2 * __BATCH) < 0)
-			batch = 1;
-		else
-			batch = __BATCH;
-
-		__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
-		if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
-					   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
-			/* we had space! */
-			return 0;
-		}
+	}
 
-		/*
-		 * lock up the sb for dipping into reserves before releasing
-		 * the space that took us to ENOSPC.
-		 */
-		spin_lock(&mp->m_sb_lock);
-		percpu_counter_add(&mp->m_sb.sb_fdblocks, -delta);
-		if (!rsvd)
-			goto fdblocks_enospc;
-
-		lcounter = (long long)mp->m_resblks_avail + delta;
-		if (lcounter >= 0) {
-			mp->m_resblks_avail = lcounter;
-			spin_unlock(&mp->m_sb_lock);
-			return 0;
-		}
-		printk_once(KERN_WARNING
-			"Filesystem \"%s\": reserve blocks depleted! "
-			"Consider increasing reserve pool size.",
-			mp->m_fsname);
-fdblocks_enospc:
-		spin_unlock(&mp->m_sb_lock);
-		return -ENOSPC;
+	/*
+	 * Taking blocks away, need to be more accurate the closer we
+	 * are to zero.
+	 *
+	 * batch size is set to a maximum of 1024 blocks - if we are
+	 * allocating or freeing extents larger than this then we aren't
+	 * going to be hammering the counter lock so a lock per update
+	 * is not a problem.
+	 *
+	 * If the counter has a value of less than 2 * max batch size,
+	 * then make everything serialise as we are real close to
+	 * ENOSPC.
+	 */
+#define __BATCH	1024
+	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
+				   2 * __BATCH) < 0)
+		batch = 1;
+	else
+		batch = __BATCH;
 
-	case XFS_SBS_FREXTENTS:
-		lcounter = (long long)mp->m_sb.sb_frextents;
-		lcounter += delta;
-		if (lcounter < 0) {
-			return -ENOSPC;
-		}
-		mp->m_sb.sb_frextents = lcounter;
-		return 0;
-	case XFS_SBS_DBLOCKS:
-		lcounter = (long long)mp->m_sb.sb_dblocks;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_dblocks = lcounter;
-		return 0;
-	case XFS_SBS_AGCOUNT:
-		scounter = mp->m_sb.sb_agcount;
-		scounter += delta;
-		if (scounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_agcount = scounter;
-		return 0;
-	case XFS_SBS_IMAX_PCT:
-		scounter = mp->m_sb.sb_imax_pct;
-		scounter += delta;
-		if (scounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_imax_pct = scounter;
+	__percpu_counter_add(&mp->m_sb.sb_fdblocks, delta, batch);
+	if (percpu_counter_compare(&mp->m_sb.sb_fdblocks,
+				   XFS_ALLOC_SET_ASIDE(mp)) >= 0) {
+		/* we had space! */
 		return 0;
-	case XFS_SBS_REXTSIZE:
-		scounter = mp->m_sb.sb_rextsize;
-		scounter += delta;
-		if (scounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_rextsize = scounter;
-		return 0;
-	case XFS_SBS_RBMBLOCKS:
-		scounter = mp->m_sb.sb_rbmblocks;
-		scounter += delta;
-		if (scounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_rbmblocks = scounter;
-		return 0;
-	case XFS_SBS_RBLOCKS:
-		lcounter = (long long)mp->m_sb.sb_rblocks;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_rblocks = lcounter;
-		return 0;
-	case XFS_SBS_REXTENTS:
-		lcounter = (long long)mp->m_sb.sb_rextents;
-		lcounter += delta;
-		if (lcounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_rextents = lcounter;
-		return 0;
-	case XFS_SBS_REXTSLOG:
-		scounter = mp->m_sb.sb_rextslog;
-		scounter += delta;
-		if (scounter < 0) {
-			ASSERT(0);
-			return -EINVAL;
-		}
-		mp->m_sb.sb_rextslog = scounter;
+	}
+
+	/*
+	 * lock up the sb for dipping into reserves before releasing
+	 * the space that took us to ENOSPC.
+	 */
+	spin_lock(&mp->m_sb_lock);
+	percpu_counter_add(&mp->m_sb.sb_fdblocks, -delta);
+	if (!rsvd)
+		goto fdblocks_enospc;
+
+	lcounter = (long long)mp->m_resblks_avail + delta;
+	if (lcounter >= 0) {
+		mp->m_resblks_avail = lcounter;
+		spin_unlock(&mp->m_sb_lock);
 		return 0;
-	default:
-		ASSERT(0);
-		return -EINVAL;
 	}
+	printk_once(KERN_WARNING
+		"Filesystem \"%s\": reserve blocks depleted! "
+		"Consider increasing reserve pool size.",
+		mp->m_fsname);
+fdblocks_enospc:
+	spin_unlock(&mp->m_sb_lock);
+	return -ENOSPC;
 }
 
-/*
- * xfs_mod_incore_sb() is used to change a field in the in-core
- * superblock structure by the specified delta.  This modification
- * is protected by the m_sb_lock.  Just use the xfs_mod_incore_sb_unlocked()
- * routine to do the work.
- */
 int
-xfs_mod_incore_sb(
+xfs_sb_mod_frextents(
 	struct xfs_mount	*mp,
-	xfs_sb_field_t		field,
-	int64_t			delta,
-	int			rsvd)
+	int64_t			delta)
 {
-	int			status;
-
-	switch (field) {
-	case XFS_SBS_ICOUNT:
-	case XFS_SBS_IFREE:
-	case XFS_SBS_FDBLOCKS:
-		return xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
-	default:
-		break;
-	}
+	int64_t			lcounter = mp->m_sb.sb_frextents;
+	int			ret = 0;
 
 	spin_lock(&mp->m_sb_lock);
-	status = xfs_mod_incore_sb_unlocked(mp, field, delta, rsvd);
+	lcounter += delta;
+	if (lcounter < 0)
+		ret = -ENOSPC;
+	else
+		mp->m_sb.sb_frextents = lcounter;
 	spin_unlock(&mp->m_sb_lock);
-
-	return status;
+	return ret;
 }
 
-/*
- * Change more than one field in the in-core superblock structure at a time.
- *
- * The fields and changes to those fields are specified in the array of
- * xfs_mod_sb structures passed in.  Either all of the specified deltas
- * will be applied or none of them will.  If any modified field dips below 0,
- * then all modifications will be backed out and EINVAL will be returned.
- *
- * Note that this function may not be used for the superblock values that
- * are tracked with the in-memory per-cpu counters - a direct call to
- * xfs_mod_incore_sb is required for these.
- */
 int
-xfs_mod_incore_sb_batch(
+xfs_sb_mod_icount(
 	struct xfs_mount	*mp,
-	xfs_mod_sb_t		*msb,
-	uint			nmsb,
-	int			rsvd)
+	int64_t			delta)
 {
-	xfs_mod_sb_t		*msbp;
-	int			error = 0;
-
-	/*
-	 * Loop through the array of mod structures and apply each individually.
-	 * If any fail, then back out all those which have already been applied.
-	 * Do all of this within the scope of the m_sb_lock so that all of the
-	 * changes will be atomic.
-	 */
-	spin_lock(&mp->m_sb_lock);
-	for (msbp = msb; msbp < (msb + nmsb); msbp++) {
-		ASSERT(msbp->msb_field < XFS_SBS_ICOUNT ||
-		       msbp->msb_field > XFS_SBS_FDBLOCKS);
-
-		error = xfs_mod_incore_sb_unlocked(mp, msbp->msb_field,
-						   msbp->msb_delta, rsvd);
-		if (error)
-			goto unwind;
+	/* deltas are +/-64, hence the large batch size of 128. */
+	__percpu_counter_add(&mp->m_sb.sb_icount, delta, 128);
+	if (percpu_counter_compare(&mp->m_sb.sb_icount, 0) < 0) {
+		ASSERT(0);
+		percpu_counter_add(&mp->m_sb.sb_icount, -delta);
+		return -EINVAL;
 	}
-	spin_unlock(&mp->m_sb_lock);
 	return 0;
+}
 
-unwind:
-	while (--msbp >= msb) {
-		error = xfs_mod_incore_sb_unlocked(mp, msbp->msb_field,
-						   -msbp->msb_delta, rsvd);
-		ASSERT(error == 0);
+int
+xfs_sb_mod_ifree(
+	struct xfs_mount	*mp,
+	int64_t			delta)
+{
+	percpu_counter_add(&mp->m_sb.sb_ifree, delta);
+	if (percpu_counter_compare(&mp->m_sb.sb_ifree, 0) < 0) {
+		ASSERT(0);
+		percpu_counter_add(&mp->m_sb.sb_ifree, -delta);
+		return -EINVAL;
 	}
-	spin_unlock(&mp->m_sb_lock);
-	return error;
+	return 0;
 }
 
 /*
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index 4e22e96..9bb06eb 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -249,15 +249,6 @@ xfs_daddr_to_agbno(struct xfs_mount *mp, xfs_daddr_t d)
 }
 
 /*
- * This structure is for use by the xfs_mod_incore_sb_batch() routine.
- * xfs_growfs can specify a few fields which are more than int limit
- */
-typedef struct xfs_mod_sb {
-	xfs_sb_field_t	msb_field;	/* Field to modify, see below */
-	int64_t		msb_delta;	/* Change to make to specified field */
-} xfs_mod_sb_t;
-
-/*
  * Per-ag incore structure, copies of information in agf and agi, to improve the
  * performance of allocation group selection.
  */
@@ -313,9 +304,6 @@ extern int	xfs_initialize_perag(xfs_mount_t *mp, xfs_agnumber_t agcount,
 				     xfs_agnumber_t *maxagi);
 
 extern void	xfs_unmountfs(xfs_mount_t *);
-extern int	xfs_mod_incore_sb(xfs_mount_t *, xfs_sb_field_t, int64_t, int);
-extern int	xfs_mod_incore_sb_batch(xfs_mount_t *, xfs_mod_sb_t *,
-			uint, int);
 extern int	xfs_mount_log_sb(xfs_mount_t *);
 extern struct xfs_buf *xfs_getsb(xfs_mount_t *, int);
 extern int	xfs_readsb(xfs_mount_t *, int);
@@ -327,6 +315,11 @@ extern int	xfs_dev_is_read_only(struct xfs_mount *, char *);
 
 extern void	xfs_set_low_space_thresholds(struct xfs_mount *);
 
+extern int	xfs_sb_mod_fdblocks(struct xfs_mount *, int64_t, bool);
+extern int	xfs_sb_mod_frextents(struct xfs_mount *, int64_t);
+extern int	xfs_sb_mod_icount(struct xfs_mount *, int64_t);
+extern int	xfs_sb_mod_ifree(struct xfs_mount *, int64_t);
+
 #endif	/* __KERNEL__ */
 
 #endif	/* __XFS_MOUNT_H__ */
diff --git a/fs/xfs/xfs_trans.c b/fs/xfs/xfs_trans.c
index b7da423..b704d5d 100644
--- a/fs/xfs/xfs_trans.c
+++ b/fs/xfs/xfs_trans.c
@@ -173,7 +173,7 @@ xfs_trans_reserve(
 	uint			rtextents)
 {
 	int		error = 0;
-	int		rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
+	bool		rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
 
 	/* Mark this thread as being in a transaction */
 	current_set_flags_nested(&tp->t_pflags, PF_FSTRANS);
@@ -184,8 +184,8 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (blocks > 0) {
-		error = xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
-					  -((int64_t)blocks), rsvd);
+		error = xfs_sb_mod_fdblocks(tp->t_mountp,
+					-((int64_t)blocks), rsvd);
 		if (error != 0) {
 			current_restore_flags_nested(&tp->t_pflags, PF_FSTRANS);
 			return -ENOSPC;
@@ -236,8 +236,8 @@ xfs_trans_reserve(
 	 * fail if the count would go below zero.
 	 */
 	if (rtextents > 0) {
-		error = xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FREXTENTS,
-					  -((int64_t)rtextents), rsvd);
+		error = xfs_sb_mod_frextents(tp->t_mountp,
+					  -((int64_t)rtextents));
 		if (error) {
 			error = -ENOSPC;
 			goto undo_log;
@@ -268,8 +268,7 @@ undo_log:
 
 undo_blocks:
 	if (blocks > 0) {
-		xfs_mod_incore_sb(tp->t_mountp, XFS_SBS_FDBLOCKS,
-					 (int64_t)blocks, rsvd);
+		xfs_sb_mod_fdblocks(tp->t_mountp, (int64_t)blocks, rsvd);
 		tp->t_blk_res = 0;
 	}
 
@@ -488,6 +487,54 @@ xfs_trans_apply_sb_deltas(
 				  sizeof(sbp->sb_frextents) - 1);
 }
 
+STATIC int
+xfs_sb_mod8(
+	uint8_t			*field,
+	int8_t			delta)
+{
+	int8_t			counter = *field;
+
+	counter += delta;
+	if (counter < 0) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+	*field = counter;
+	return 0;
+}
+
+STATIC int
+xfs_sb_mod32(
+	uint32_t		*field,
+	int32_t			delta)
+{
+	int32_t			counter = *field;
+
+	counter += delta;
+	if (counter < 0) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+	*field = counter;
+	return 0;
+}
+
+STATIC int
+xfs_sb_mod64(
+	uint64_t		*field,
+	int64_t			delta)
+{
+	int64_t			counter = *field;
+
+	counter += delta;
+	if (counter < 0) {
+		ASSERT(0);
+		return -EINVAL;
+	}
+	*field = counter;
+	return 0;
+}
+
 /*
  * xfs_trans_unreserve_and_mod_sb() is called to release unused reservations
  * and apply superblock counter changes to the in-core superblock.  The
@@ -495,13 +542,6 @@ xfs_trans_apply_sb_deltas(
  * applied to the in-core superblock.  The idea is that that has already been
  * done.
  *
- * This is done efficiently with a single call to xfs_mod_incore_sb_batch().
- * However, we have to ensure that we only modify each superblock field only
- * once because the application of the delta values may not be atomic. That can
- * lead to ENOSPC races occurring if we have two separate modifcations of the
- * free space counter to put back the entire reservation and then take away
- * what we used.
- *
  * If we are not logging superblock counters, then the inode allocated/free and
  * used block counts are not updated in the on disk superblock. In this case,
  * XFS_TRANS_SB_DIRTY will not be set when the transaction is updated but we
@@ -511,18 +551,14 @@ void
 xfs_trans_unreserve_and_mod_sb(
 	xfs_trans_t	*tp)
 {
-	xfs_mod_sb_t	msb[9];	/* If you add cases, add entries */
-	xfs_mod_sb_t	*msbp;
 	xfs_mount_t	*mp = tp->t_mountp;
-	/* REFERENCED */
 	int		error;
-	int		rsvd;
+	bool		rsvd;
 	int64_t		blkdelta = 0;
 	int64_t		rtxdelta = 0;
 	int64_t		idelta = 0;
 	int64_t		ifreedelta = 0;
 
-	msbp = msb;
 	rsvd = (tp->t_flags & XFS_TRANS_RESERVE) != 0;
 
 	/* calculate deltas */
@@ -547,94 +583,110 @@ xfs_trans_unreserve_and_mod_sb(
 
 	/* apply the per-cpu counters */
 	if (blkdelta) {
-		error = xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, blkdelta, rsvd);
+		error = xfs_sb_mod_fdblocks(mp, blkdelta, rsvd);
 		if (error)
 			goto out;
 	}
 
 	if (idelta) {
-		error = xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, idelta, rsvd);
+		error = xfs_sb_mod_icount(mp, idelta);
 		if (error)
 			goto out_undo_fdblocks;
 	}
 
 	if (ifreedelta) {
-		error = xfs_mod_incore_sb(mp, XFS_SBS_IFREE, ifreedelta, rsvd);
+		error = xfs_sb_mod_ifree(mp, ifreedelta);
 		if (error)
 			goto out_undo_icount;
 	}
 
 	/* apply remaining deltas */
-	if (rtxdelta != 0) {
-		msbp->msb_field = XFS_SBS_FREXTENTS;
-		msbp->msb_delta = rtxdelta;
-		msbp++;
-	}
+	if (rtxdelta == 0 && !(tp->t_flags & XFS_TRANS_SB_DIRTY))
+		return;
 
-	if (tp->t_flags & XFS_TRANS_SB_DIRTY) {
-		if (tp->t_dblocks_delta != 0) {
-			msbp->msb_field = XFS_SBS_DBLOCKS;
-			msbp->msb_delta = tp->t_dblocks_delta;
-			msbp++;
-		}
-		if (tp->t_agcount_delta != 0) {
-			msbp->msb_field = XFS_SBS_AGCOUNT;
-			msbp->msb_delta = tp->t_agcount_delta;
-			msbp++;
-		}
-		if (tp->t_imaxpct_delta != 0) {
-			msbp->msb_field = XFS_SBS_IMAX_PCT;
-			msbp->msb_delta = tp->t_imaxpct_delta;
-			msbp++;
-		}
-		if (tp->t_rextsize_delta != 0) {
-			msbp->msb_field = XFS_SBS_REXTSIZE;
-			msbp->msb_delta = tp->t_rextsize_delta;
-			msbp++;
-		}
-		if (tp->t_rbmblocks_delta != 0) {
-			msbp->msb_field = XFS_SBS_RBMBLOCKS;
-			msbp->msb_delta = tp->t_rbmblocks_delta;
-			msbp++;
-		}
-		if (tp->t_rblocks_delta != 0) {
-			msbp->msb_field = XFS_SBS_RBLOCKS;
-			msbp->msb_delta = tp->t_rblocks_delta;
-			msbp++;
-		}
-		if (tp->t_rextents_delta != 0) {
-			msbp->msb_field = XFS_SBS_REXTENTS;
-			msbp->msb_delta = tp->t_rextents_delta;
-			msbp++;
-		}
-		if (tp->t_rextslog_delta != 0) {
-			msbp->msb_field = XFS_SBS_REXTSLOG;
-			msbp->msb_delta = tp->t_rextslog_delta;
-			msbp++;
-		}
+	spin_lock(&mp->m_sb_lock);
+	if (rtxdelta) {
+		error = xfs_sb_mod64(&mp->m_sb.sb_frextents, rtxdelta);
+		if (error)
+			goto out_undo_ifree;
 	}
-
-	/*
-	 * If we need to change anything, do it.
-	 */
-	if (msbp > msb) {
-		error = xfs_mod_incore_sb_batch(tp->t_mountp, msb,
-			(uint)(msbp - msb), rsvd);
+	if (tp->t_dblocks_delta != 0) {
+		error = xfs_sb_mod64(&mp->m_sb.sb_dblocks, tp->t_dblocks_delta);
 		if (error)
-			goto out_undo_ifreecount;
+			goto out_undo_frextents;
 	}
-
+	if (tp->t_agcount_delta != 0) {
+		error = xfs_sb_mod32(&mp->m_sb.sb_agcount, tp->t_agcount_delta);
+		if (error)
+			goto out_undo_dblocks;
+	}
+	if (tp->t_imaxpct_delta != 0) {
+		error = xfs_sb_mod8(&mp->m_sb.sb_imax_pct, tp->t_imaxpct_delta);
+		if (error)
+			goto out_undo_agcount;
+	}
+	if (tp->t_rextsize_delta != 0) {
+		error = xfs_sb_mod32(&mp->m_sb.sb_rextsize, tp->t_rextsize_delta);
+		if (error)
+			goto out_undo_imaxpct;
+	}
+	if (tp->t_rbmblocks_delta != 0) {
+		error = xfs_sb_mod32(&mp->m_sb.sb_rbmblocks, tp->t_rbmblocks_delta);
+		if (error)
+			goto out_undo_rextsize;
+	}
+	if (tp->t_rblocks_delta != 0) {
+		error = xfs_sb_mod64(&mp->m_sb.sb_rblocks, tp->t_rblocks_delta);
+		if (error)
+			goto out_undo_rbmblocks;
+	}
+	if (tp->t_rextents_delta != 0) {
+		error = xfs_sb_mod64(&mp->m_sb.sb_rextents, tp->t_rextents_delta);
+		if (error)
+			goto out_undo_rblocks;
+	}
+	if (tp->t_rextslog_delta != 0) {
+		error = xfs_sb_mod8(&mp->m_sb.sb_rextslog, tp->t_rextslog_delta);
+		if (error)
+			goto out_undo_rextents;
+	}
+	spin_unlock(&mp->m_sb_lock);
 	return;
 
-out_undo_ifreecount:
+out_undo_rextents:
+	if (tp->t_rextents_delta)
+		xfs_sb_mod64(&mp->m_sb.sb_rextents, -tp->t_rextents_delta);
+out_undo_rblocks:
+	if (tp->t_rblocks_delta)
+		xfs_sb_mod64(&mp->m_sb.sb_rblocks, -tp->t_rblocks_delta);
+out_undo_rbmblocks:
+	if (tp->t_rbmblocks_delta)
+		xfs_sb_mod32(&mp->m_sb.sb_rbmblocks, -tp->t_rbmblocks_delta);
+out_undo_rextsize:
+	if (tp->t_rextsize_delta)
+		xfs_sb_mod32(&mp->m_sb.sb_rextsize, -tp->t_rextsize_delta);
+out_undo_imaxpct:
+	if (tp->t_imaxpct_delta)
+		xfs_sb_mod8(&mp->m_sb.sb_imax_pct, -tp->t_imaxpct_delta);
+out_undo_agcount:
+	if (tp->t_agcount_delta)
+		xfs_sb_mod32(&mp->m_sb.sb_agcount, -tp->t_agcount_delta);
+out_undo_dblocks:
+	if (tp->t_dblocks_delta)
+		xfs_sb_mod64(&mp->m_sb.sb_dblocks, -tp->t_dblocks_delta);
+out_undo_frextents:
+	if (rtxdelta)
+		xfs_sb_mod64(&mp->m_sb.sb_frextents, -rtxdelta);
+out_undo_ifree:
+	spin_unlock(&mp->m_sb_lock);
 	if (ifreedelta)
-		xfs_mod_incore_sb(mp, XFS_SBS_IFREE, -ifreedelta, rsvd);
+		xfs_sb_mod_ifree(mp, -ifreedelta);
 out_undo_icount:
 	if (idelta)
-		xfs_mod_incore_sb(mp, XFS_SBS_ICOUNT, -idelta, rsvd);
+		xfs_sb_mod_icount(mp, -idelta);
 out_undo_fdblocks:
 	if (blkdelta)
-		xfs_mod_incore_sb(mp, XFS_SBS_FDBLOCKS, -blkdelta, rsvd);
+		xfs_sb_mod_fdblocks(mp, -blkdelta, rsvd);
 out:
 	ASSERT(error == 0);
 	return;


* Re: [RFC PATCH 0/5] xfs: use generic percpu counters for icsb
  2015-02-03 21:50 ` [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Christoph Hellwig
@ 2015-02-03 21:58   ` Dave Chinner
  2015-02-03 22:02     ` Christoph Hellwig
  0 siblings, 1 reply; 24+ messages in thread
From: Dave Chinner @ 2015-02-03 21:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Tue, Feb 03, 2015 at 01:50:43PM -0800, Christoph Hellwig wrote:
> FYI, I think we should just get rid of the horrible xfs_mod_incore_sb(_batch)
> interface as part of this.  The patch below applies on top of your
> series.

I like it - getting rid of those field bitmasks altogether is a
great result.
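
For reference, the per-counter helpers that replace the bitmask
interface look like this (signatures inferred from the call sites in
the patch above; a sketch, not the actual declarations):

	int xfs_sb_mod_icount(struct xfs_mount *mp, int64_t delta);
	int xfs_sb_mod_ifree(struct xfs_mount *mp, int64_t delta);
	int xfs_sb_mod_fdblocks(struct xfs_mount *mp, int64_t delta,
				bool rsvd);
	int xfs_sb_mod_frextents(struct xfs_mount *mp, int64_t delta);

Each returns zero or a negative errno, so callers can keep the
existing "if (error) goto undo" pattern without any field bitmask.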

It probably needs to be split up into several patches, though. I can
do that over the next few days and add it to the patchset if you
want.  The icsb changes are 3.21 material, anyway, so there's no
real hurry on this....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [RFC PATCH 0/5] xfs: use generic percpu counters for icsb
  2015-02-03 21:58   ` Dave Chinner
@ 2015-02-03 22:02     ` Christoph Hellwig
  2015-02-03 22:13       ` Dave Chinner
  0 siblings, 1 reply; 24+ messages in thread
From: Christoph Hellwig @ 2015-02-03 22:02 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Christoph Hellwig, xfs

On Wed, Feb 04, 2015 at 08:58:50AM +1100, Dave Chinner wrote:
> It probably needs to be split up into several patches, though. I can
> do that over the next few days and add it to the patchset if you
> want.  The icsb changes are 3.21 material, anyway, so there's no
> real hurry on this....

Might be best if you fold the changes for the three counters that
your series touches into those patches.  Then I can split the remainder
up into one for the realtime extents as the only one called from
code outside the transaction subsystem, and then the final one to
redo the batch changes from the transaction code.


* Re: [RFC PATCH 0/5] xfs: use generic percpu counters for icsb
  2015-02-03 22:02     ` Christoph Hellwig
@ 2015-02-03 22:13       ` Dave Chinner
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2015-02-03 22:13 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Tue, Feb 03, 2015 at 02:02:08PM -0800, Christoph Hellwig wrote:
> On Wed, Feb 04, 2015 at 08:58:50AM +1100, Dave Chinner wrote:
> > It probably needs to be split up into several patches, though. I can
> > do that over the next few days and add it to the patchset if you
> > want.  The icsb changes are 3.21 material, anyway, so there's no
> > real hurry on this....
> 
> Might be best if you fold the changes for the three counters that
> your series touches into those patches.  Then I can split the remainder
> up into one for the realtime extents as the only one called from
> code outside the transaction subsystem, and then the final one to
> redo the batch changes from the transaction code.

Sounds like a plan to me.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format
  2015-02-03 21:46         ` Dave Chinner
@ 2015-02-03 23:34           ` Dave Chinner
  0 siblings, 0 replies; 24+ messages in thread
From: Dave Chinner @ 2015-02-03 23:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Wed, Feb 04, 2015 at 08:46:09AM +1100, Dave Chinner wrote:
> On Tue, Feb 03, 2015 at 01:37:44PM -0800, Christoph Hellwig wrote:
> > On Tue, Feb 03, 2015 at 06:30:21AM +1100, Dave Chinner wrote:
> > > > I'd expect to move it close to struct xfs_mount, and maybe even merge
> > > > it into that in the long run.
> > > 
> > > I guess moving the structure there is fine, but we still want all
> > > the version functions to be shared with userspace, which then makes
> > > for an interesting set of dependencies. Any other ideas?
> > 
> > Are they really worth the sharing?  If they are worth it we'll
> > need something that can expect an xfs_sb/xfs_mount to be defined.
> 
> I suppose we could stop sharing them - they change rarely enough
> and it's only a few lines of code for each new feature that would
> then need to be duplicated. Not a huge burden...

Just a further thought on this - I might keep the per-cpu counters
in struct xfs_mount. That way the to/from disk code only needs to
sum/set the per-cpu counter values to/from m_sb, as it currently
does, and struct xfs_sb can remain unchanged for the moment.

That might be a cleaner way to start this patchset, especially as we
already have the per-cpu counter hooks in all the places we need
them.
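
A minimal sketch of that layout (hypothetical helper name, field
names assumed; not the actual patch):

	#include <linux/percpu_counter.h>

	struct xfs_mount {
		struct xfs_sb		m_sb;		/* in-core superblock */
		struct percpu_counter	m_icount;	/* allocated inodes */
		struct percpu_counter	m_ifree;	/* free inodes */
		struct percpu_counter	m_fdblocks;	/* free data blocks */
	};

	/* Fold the per-cpu values back into the in-core superblock. */
	static void xfs_sb_sum_counters(struct xfs_mount *mp)
	{
		mp->m_sb.sb_icount = percpu_counter_sum_positive(&mp->m_icount);
		mp->m_sb.sb_ifree = percpu_counter_sum_positive(&mp->m_ifree);
		mp->m_sb.sb_fdblocks = percpu_counter_sum_positive(&mp->m_fdblocks);
	}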

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


end of thread, other threads:[~2015-02-03 23:36 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-01 21:42 [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Dave Chinner
2015-02-01 21:42 ` [PATCH 1/5] xfs: struct xfs_sb is no longer tied to the on-disk format Dave Chinner
2015-02-02  8:41   ` Christoph Hellwig
2015-02-02 19:30     ` Dave Chinner
2015-02-03 21:37       ` Christoph Hellwig
2015-02-03 21:46         ` Dave Chinner
2015-02-03 23:34           ` Dave Chinner
2015-02-01 21:43 ` [PATCH 2/5] xfs: use generic percpu counters for inode counter Dave Chinner
2015-02-02 16:44   ` Christoph Hellwig
2015-02-02 19:33     ` Dave Chinner
2015-02-03 21:38       ` Christoph Hellwig
2015-02-01 21:43 ` [PATCH 3/5] xfs: use generic percpu counters for free " Dave Chinner
2015-02-02 17:10   ` Brian Foster
2015-02-01 21:43 ` [PATCH 4/5] xfs: use generic percpu counters for free block counter Dave Chinner
2015-02-02 16:48   ` Christoph Hellwig
2015-02-02 19:34     ` Dave Chinner
2015-02-02 17:11   ` Brian Foster
2015-02-02 19:39     ` Dave Chinner
2015-02-01 21:43 ` [PATCH 5/5] xfs: Remove icsb infrastructure Dave Chinner
2015-02-02 17:11   ` Brian Foster
2015-02-03 21:50 ` [RFC PATCH 0/5] xfs: use generic percpu counters for icsb Christoph Hellwig
2015-02-03 21:58   ` Dave Chinner
2015-02-03 22:02     ` Christoph Hellwig
2015-02-03 22:13       ` Dave Chinner
