All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 00/10] speculative preallocation inode tracking
@ 2012-10-05 14:17 Brian Foster
  2012-10-05 14:17 ` [PATCH v5 01/10] xfs: add EOFBLOCKS inode tagging/untagging Brian Foster
                   ` (11 more replies)
  0 siblings, 12 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Hi all,

This is v3 of the speculative preallocation inode tracking patchset. This
functionality tracks inodes with post-EOF speculative preallocation for the
purpose of background and on-demand trimming.

Background scanning occurs on a longish interval (5 minutes by default) and in
a best-effort mode (i.e., inodes are skipped due to lock contention or dirty
cache). The intent is to clear up post-EOF blocks on inodes that might have
allocations hanging around due to open-write-close sequences (NFS).

On demand scanning is provided via a new ioctl and supports various parameters
such as scan mode, filtering by quota id and minimum file size. A pending use
case for on demand scanning is for accurate quota accounting via the gluster
scale out filesystem (i.e., to free up preallocated space when near a usage
limit).

Brian

v5:
- Rebased against the die-xfssyncd-die patchset.
- Renamed xfs_inodes_free_eofblocks() to xfs_icache_free_eofblocks().
- Migrated xfs_can_free_eofblocks() from static inline to xfs_inode.c.
- Fixed xfs_eofblocks version field and added version check to ioctl().
- Reworked quota filtering to use standard quota fields. Added patch 7 to
  support.
- Split min. file size filtering into a separate patch.
- Reworked background scanning based on rebase. Created a new eofblocks
  workqueue, renamed the tunable and based its value in seconds.

v4:
- Fix the broken need_iolock parameter to the xfs_free_eofblocks() call in
  patch 5 (thanks to Ben M. for testing). This is a one line change. All
  other patches equivalent to v3. This version survives an xfstests run (with
  XFS_DEBUG enabled this time :P).

v3:
- Pushed dirty cache check up into patch 5 (minor clean up).
- Reworked xfs_can_free_eofblocks() in patch 3 as per Dave C.'s review.
- Rebased from linus' tree to the XFS tree.

v2:
- Remove unnecessary inode flag clear helper.
- Condense eofblocks set/clear tag functions.
- Move clear tag call into xfs_free_eofblocks().
- Modify AG walk infrastructure to support tag-based walk and utilize this
  functionality for the eofblocks scan (as opposed to the previous code
  duplicated from reclaim scanning).
- Improve ioctl functionality: new data structure fields/flags, validate quota
  is enabled.
- Increase default background scanning interval to 5 minutes, add tunable.

Brian Foster (10):
  xfs: add EOFBLOCKS inode tagging/untagging
  xfs: support a tag-based inode_ag_iterator
  xfs: create helper to check whether to free eofblocks on inode
  xfs: make xfs_free_eofblocks() non-static, return EAGAIN on trylock
    failure
  xfs: create function to scan and clear EOFBLOCKS inodes
  xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  xfs: make xfs_quota_type() non-static
  xfs: add quota id filtering to eofblocks scan
  xfs: add minimum file size filtering to eofblocks scan
  xfs: add background scanning to clear eofblocks inodes

 fs/xfs/xfs_ag.h          |    1 +
 fs/xfs/xfs_fs.h          |   20 ++++
 fs/xfs/xfs_globals.c     |    1 +
 fs/xfs/xfs_icache.c      |  220 ++++++++++++++++++++++++++++++++++++++++++++--
 fs/xfs/xfs_icache.h      |   12 ++-
 fs/xfs/xfs_inode.c       |   37 ++++++++
 fs/xfs/xfs_inode.h       |    1 +
 fs/xfs/xfs_ioctl.c       |   24 +++++
 fs/xfs/xfs_iomap.c       |    8 ++
 fs/xfs/xfs_iops.c        |    4 +
 fs/xfs/xfs_linux.h       |    1 +
 fs/xfs/xfs_mount.c       |    2 +
 fs/xfs/xfs_mount.h       |    3 +
 fs/xfs/xfs_qm_syscalls.c |    5 +-
 fs/xfs/xfs_quota.h       |    1 +
 fs/xfs/xfs_quotaops.c    |    2 +-
 fs/xfs/xfs_super.c       |    9 ++
 fs/xfs/xfs_sysctl.c      |    9 ++
 fs/xfs/xfs_sysctl.h      |    1 +
 fs/xfs/xfs_trace.h       |    6 ++
 fs/xfs/xfs_vnodeops.c    |   27 +++---
 fs/xfs/xfs_vnodeops.h    |    1 +
 22 files changed, 368 insertions(+), 27 deletions(-)

-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH v5 01/10] xfs: add EOFBLOCKS inode tagging/untagging
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-05 14:17 ` [PATCH v5 02/10] xfs: support a tag-based inode_ag_iterator Brian Foster
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Add the XFS_ICI_EOFBLOCKS_TAG inode tag to identify inodes with
speculatively preallocated blocks beyond EOF. An inode is tagged
when speculative preallocation occurs and untagged either via
truncate down or when post-EOF blocks are freed via release or
reclaim.

The tag management is intentionally not aggressive to prefer
simplicity over the complexity of handling all the corner cases
under which post-EOF blocks could be freed (i.e., forward
truncation, fallocate, write error conditions, etc.). This means
that a tagged inode may or may not have post-EOF blocks after a
period of time. The tag is eventually cleared when the inode is
released or reclaimed.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_ag.h       |    1 +
 fs/xfs/xfs_icache.c   |   62 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h   |    3 ++
 fs/xfs/xfs_iomap.c    |    8 ++++++
 fs/xfs/xfs_iops.c     |    4 +++
 fs/xfs/xfs_trace.h    |    5 ++++
 fs/xfs/xfs_vnodeops.c |    2 +
 7 files changed, 85 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_ag.h b/fs/xfs/xfs_ag.h
index 44d65c1..22bd4db 100644
--- a/fs/xfs/xfs_ag.h
+++ b/fs/xfs/xfs_ag.h
@@ -233,6 +233,7 @@ typedef struct xfs_perag {
 #define XFS_ICI_NO_TAG		(-1)	/* special flag for an untagged lookup
 					   in xfs_inode_ag_iterator */
 #define XFS_ICI_RECLAIM_TAG	0	/* inode is to be reclaimed */
+#define XFS_ICI_EOFBLOCKS_TAG	1	/* inode has blocks beyond EOF */
 
 #define	XFS_AG_MAXLEVELS(mp)		((mp)->m_ag_maxlevels)
 #define	XFS_MIN_FREELIST_RAW(bl,cl,mp)	\
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 9c8703b..f9afc5f 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1128,3 +1128,65 @@ xfs_reclaim_inodes_count(
 	return reclaimable;
 }
 
+void
+xfs_inode_set_eofblocks_tag(
+	xfs_inode_t	*ip)
+{
+	struct xfs_mount *mp = ip->i_mount;
+	struct xfs_perag *pag;
+	int tagged;
+
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+	spin_lock(&pag->pag_ici_lock);
+	trace_xfs_inode_set_eofblocks_tag(ip);
+
+	tagged = radix_tree_tagged(&pag->pag_ici_root,
+				   XFS_ICI_EOFBLOCKS_TAG);
+	radix_tree_tag_set(&pag->pag_ici_root,
+			   XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino),
+			   XFS_ICI_EOFBLOCKS_TAG);
+	if (!tagged) {
+		/* propagate the eofblocks tag up into the perag radix tree */
+		spin_lock(&ip->i_mount->m_perag_lock);
+		radix_tree_tag_set(&ip->i_mount->m_perag_tree,
+				   XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino),
+				   XFS_ICI_EOFBLOCKS_TAG);
+		spin_unlock(&ip->i_mount->m_perag_lock);
+
+		trace_xfs_perag_set_eofblocks(ip->i_mount, pag->pag_agno,
+					      -1, _RET_IP_);
+	}
+
+	spin_unlock(&pag->pag_ici_lock);
+	xfs_perag_put(pag);
+}
+
+void
+xfs_inode_clear_eofblocks_tag(
+	xfs_inode_t	*ip)
+{
+	struct xfs_mount *mp = ip->i_mount;
+	struct xfs_perag *pag;
+
+	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
+	spin_lock(&pag->pag_ici_lock);
+	trace_xfs_inode_clear_eofblocks_tag(ip);
+
+	radix_tree_tag_clear(&pag->pag_ici_root,
+			     XFS_INO_TO_AGINO(ip->i_mount, ip->i_ino),
+			     XFS_ICI_EOFBLOCKS_TAG);
+	if (!radix_tree_tagged(&pag->pag_ici_root, XFS_ICI_EOFBLOCKS_TAG)) {
+		/* clear the eofblocks tag from the perag radix tree */
+		spin_lock(&ip->i_mount->m_perag_lock);
+		radix_tree_tag_clear(&ip->i_mount->m_perag_tree,
+				     XFS_INO_TO_AGNO(ip->i_mount, ip->i_ino),
+				     XFS_ICI_EOFBLOCKS_TAG);
+		spin_unlock(&ip->i_mount->m_perag_lock);
+		trace_xfs_perag_clear_eofblocks(ip->i_mount, pag->pag_agno,
+					       -1, _RET_IP_);
+	}
+
+	spin_unlock(&pag->pag_ici_lock);
+	xfs_perag_put(pag);
+}
+
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index 222e22f..db36130 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -35,6 +35,9 @@ void xfs_reclaim_inodes_nr(struct xfs_mount *mp, int nr_to_scan);
 
 void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
+void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
+void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
+
 int xfs_sync_inode_grab(struct xfs_inode *ip);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
 	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags),
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index f858b90..08b8777 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -41,6 +41,7 @@
 #include "xfs_utils.h"
 #include "xfs_iomap.h"
 #include "xfs_trace.h"
+#include "xfs_icache.h"
 
 
 #define XFS_WRITEIO_ALIGN(mp,off)	(((off) >> mp->m_writeio_log) \
@@ -450,6 +451,13 @@ retry:
 	if (!(imap[0].br_startblock || XFS_IS_REALTIME_INODE(ip)))
 		return xfs_alert_fsblock_zero(ip, &imap[0]);
 
+	/*
+	 * Tag the inode as speculatively preallocated so we can reclaim this
+	 * space on demand, if necessary.
+	 */
+	if (prealloc)
+		xfs_inode_set_eofblocks_tag(ip);
+
 	*ret_imap = imap[0];
 	return 0;
 }
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 4e00cf0..81f5c49 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -38,6 +38,7 @@
 #include "xfs_vnodeops.h"
 #include "xfs_inode_item.h"
 #include "xfs_trace.h"
+#include "xfs_icache.h"
 
 #include <linux/capability.h>
 #include <linux/xattr.h>
@@ -854,6 +855,9 @@ xfs_setattr_size(
 		 * and do not wait the usual (long) time for writeout.
 		 */
 		xfs_iflags_set(ip, XFS_ITRUNCATED);
+
+		/* A truncate down always removes post-EOF blocks. */
+		xfs_inode_clear_eofblocks_tag(ip);
 	}
 
 	if (mask & ATTR_CTIME) {
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 7d36ccf..6f46e03 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -130,6 +130,8 @@ DEFINE_PERAG_REF_EVENT(xfs_perag_get_tag);
 DEFINE_PERAG_REF_EVENT(xfs_perag_put);
 DEFINE_PERAG_REF_EVENT(xfs_perag_set_reclaim);
 DEFINE_PERAG_REF_EVENT(xfs_perag_clear_reclaim);
+DEFINE_PERAG_REF_EVENT(xfs_perag_set_eofblocks);
+DEFINE_PERAG_REF_EVENT(xfs_perag_clear_eofblocks);
 
 TRACE_EVENT(xfs_attr_list_node_descend,
 	TP_PROTO(struct xfs_attr_list_context *ctx,
@@ -585,6 +587,9 @@ DEFINE_INODE_EVENT(xfs_update_time);
 DEFINE_INODE_EVENT(xfs_dquot_dqalloc);
 DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
 
+DEFINE_INODE_EVENT(xfs_inode_set_eofblocks_tag);
+DEFINE_INODE_EVENT(xfs_inode_clear_eofblocks_tag);
+
 DECLARE_EVENT_CLASS(xfs_iref_class,
 	TP_PROTO(struct xfs_inode *ip, unsigned long caller_ip),
 	TP_ARGS(ip, caller_ip),
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index e9cae13..d8ce502 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -238,6 +238,8 @@ xfs_free_eofblocks(
 		} else {
 			error = xfs_trans_commit(tp,
 						XFS_TRANS_RELEASE_LOG_RES);
+			if (!error)
+				xfs_inode_clear_eofblocks_tag(ip);
 		}
 
 		xfs_iunlock(ip, XFS_ILOCK_EXCL);
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 02/10] xfs: support a tag-based inode_ag_iterator
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
  2012-10-05 14:17 ` [PATCH v5 01/10] xfs: add EOFBLOCKS inode tagging/untagging Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-05 14:17 ` [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode Brian Foster
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Genericize xfs_inode_ag_walk() to support an optional radix tree tag
and args argument for the execute function. Create a new wrapper
called xfs_inode_ag_iterator_tag() that performs a tag based walk
of perag's and inodes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_icache.c      |   56 ++++++++++++++++++++++++++++++++++++++++-----
 fs/xfs/xfs_icache.h      |    7 ++++-
 fs/xfs/xfs_qm_syscalls.c |    5 ++-
 3 files changed, 57 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index f9afc5f..878eb24 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -516,8 +516,11 @@ xfs_inode_ag_walk(
 	struct xfs_mount	*mp,
 	struct xfs_perag	*pag,
 	int			(*execute)(struct xfs_inode *ip,
-					   struct xfs_perag *pag, int flags),
-	int			flags)
+					   struct xfs_perag *pag, int flags,
+					   void *args),
+	int			flags,
+	void			*args,
+	int			tag)
 {
 	uint32_t		first_index;
 	int			last_error = 0;
@@ -536,9 +539,17 @@ restart:
 		int		i;
 
 		rcu_read_lock();
-		nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
+
+		if (tag == -1) 
+			nr_found = radix_tree_gang_lookup(&pag->pag_ici_root,
 					(void **)batch, first_index,
 					XFS_LOOKUP_BATCH);
+		else
+			nr_found = radix_tree_gang_lookup_tag(
+					&pag->pag_ici_root,
+					(void **) batch, first_index,
+					XFS_LOOKUP_BATCH, tag);
+
 		if (!nr_found) {
 			rcu_read_unlock();
 			break;
@@ -579,7 +590,7 @@ restart:
 		for (i = 0; i < nr_found; i++) {
 			if (!batch[i])
 				continue;
-			error = execute(batch[i], pag, flags);
+			error = execute(batch[i], pag, flags, args);
 			IRELE(batch[i]);
 			if (error == EAGAIN) {
 				skipped++;
@@ -608,8 +619,10 @@ int
 xfs_inode_ag_iterator(
 	struct xfs_mount	*mp,
 	int			(*execute)(struct xfs_inode *ip,
-					   struct xfs_perag *pag, int flags),
-	int			flags)
+					   struct xfs_perag *pag, int flags,
+					   void *args),
+	int			flags,
+	void			*args)
 {
 	struct xfs_perag	*pag;
 	int			error = 0;
@@ -619,7 +632,36 @@ xfs_inode_ag_iterator(
 	ag = 0;
 	while ((pag = xfs_perag_get(mp, ag))) {
 		ag = pag->pag_agno + 1;
-		error = xfs_inode_ag_walk(mp, pag, execute, flags);
+		error = xfs_inode_ag_walk(mp, pag, execute, flags, args, -1);
+		xfs_perag_put(pag);
+		if (error) {
+			last_error = error;
+			if (error == EFSCORRUPTED)
+				break;
+		}
+	}
+	return XFS_ERROR(last_error);
+}
+
+int
+xfs_inode_ag_iterator_tag(
+	struct xfs_mount	*mp,
+	int			(*execute)(struct xfs_inode *ip,
+					   struct xfs_perag *pag, int flags,
+					   void *args),
+	int			flags,
+	void			*args,
+	int			tag)
+{
+	struct xfs_perag	*pag;
+	int			error = 0;
+	int			last_error = 0;
+	xfs_agnumber_t		ag;
+
+	ag = 0;
+	while ((pag = xfs_perag_get_tag(mp, ag, tag))) {
+		ag = pag->pag_agno + 1;
+		error = xfs_inode_ag_walk(mp, pag, execute, flags, args, tag);
 		xfs_perag_put(pag);
 		if (error) {
 			last_error = error;
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index db36130..bb80237 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -40,7 +40,10 @@ void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
 
 int xfs_sync_inode_grab(struct xfs_inode *ip);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
-	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags),
-	int flags);
+	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags, void *args),
+	int flags, void *args);
+int xfs_inode_ag_iterator_tag(struct xfs_mount *mp,
+	int (*execute)(struct xfs_inode *ip, struct xfs_perag *pag, int flags, void *args),
+	int flags, void *args, int tag);
 
 #endif
diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
index 7a9071f..5f53e75 100644
--- a/fs/xfs/xfs_qm_syscalls.c
+++ b/fs/xfs/xfs_qm_syscalls.c
@@ -846,7 +846,8 @@ STATIC int
 xfs_dqrele_inode(
 	struct xfs_inode	*ip,
 	struct xfs_perag	*pag,
-	int			flags)
+	int			flags,
+	void			*args)
 {
 	/* skip quota inodes */
 	if (ip == ip->i_mount->m_quotainfo->qi_uquotaip ||
@@ -882,5 +883,5 @@ xfs_qm_dqrele_all_inodes(
 	uint		 flags)
 {
 	ASSERT(mp->m_quotainfo);
-	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags);
+	xfs_inode_ag_iterator(mp, xfs_dqrele_inode, flags, NULL);
 }
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
  2012-10-05 14:17 ` [PATCH v5 01/10] xfs: add EOFBLOCKS inode tagging/untagging Brian Foster
  2012-10-05 14:17 ` [PATCH v5 02/10] xfs: support a tag-based inode_ag_iterator Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  0:58   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 04/10] xfs: make xfs_free_eofblocks() non-static, return EAGAIN on trylock failure Brian Foster
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

This check is used in multiple places to determine whether we
should check for (and potentially free) post EOF blocks on an
inode. Add a helper to consolidate the check.

Note that when we remove an inode from the cache (xfs_inactive()),
we are required to trim post-EOF blocks even if the inode is marked
preallocated or append-only to maintain correct space accounting.
The 'force' parameter to xfs_can_free_eofblocks() specifies whether
we should ignore the prealloc/append-only status of the inode.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_inode.c    |   37 +++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_inode.h    |    1 +
 fs/xfs/xfs_vnodeops.c |   19 +++++++------------
 3 files changed, 45 insertions(+), 12 deletions(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index bba8f37..3e6cf11 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3911,3 +3911,40 @@ xfs_iext_irec_update_extoffs(
 		ifp->if_u1.if_ext_irec[i].er_extoff += ext_diff;
 	}
 }
+
+/*
+ * Test whether it is appropriate to check an inode for and free post EOF
+ * blocks. The 'force' parameter determines whether we should also consider
+ * regular files that are marked preallocated or append-only.
+ */
+bool
+xfs_can_free_eofblocks(struct xfs_inode *ip, bool force)
+{
+	/* prealloc/delalloc exists only on regular files */
+	if (!S_ISREG(ip->i_d.di_mode))
+		return false;
+
+	/*
+	 * Zero sized files with no cached pages and delalloc blocks will not
+	 * have speculative prealloc/delalloc blocks to remove.
+	 */
+	if (VFS_I(ip)->i_size == 0 &&
+	    VN_CACHED(VFS_I(ip)) == 0 &&
+	    ip->i_delayed_blks == 0)
+		return false;
+
+	/* If we haven't read in the extent list, then don't do it now. */
+	if (!(ip->i_df.if_flags & XFS_IFEXTENTS))
+		return false;
+
+	/*
+	 * Do not free real preallocated or append-only files unless the file
+	 * has delalloc blocks and we are forced to remove them.
+	 */
+	if (ip->i_d.di_flags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND))
+		if (!force || ip->i_delayed_blks == 0)
+			return false;
+
+	return true;
+}
+
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index b3dabe9..0745c07 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -600,6 +600,7 @@ void		xfs_iext_irec_compact(xfs_ifork_t *);
 void		xfs_iext_irec_compact_pages(xfs_ifork_t *);
 void		xfs_iext_irec_compact_full(xfs_ifork_t *);
 void		xfs_iext_irec_update_extoffs(xfs_ifork_t *, int, int);
+bool		xfs_can_free_eofblocks(struct xfs_inode *, bool);
 
 #define xfs_ipincount(ip)	((unsigned int) atomic_read(&ip->i_pincount))
 
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index d8ce502..1f3c53d8 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -436,11 +436,7 @@ xfs_release(
 	if (ip->i_d.di_nlink == 0)
 		return 0;
 
-	if ((S_ISREG(ip->i_d.di_mode) &&
-	     (VFS_I(ip)->i_size > 0 ||
-	      (VN_CACHED(VFS_I(ip)) > 0 || ip->i_delayed_blks > 0)) &&
-	     (ip->i_df.if_flags & XFS_IFEXTENTS))  &&
-	    (!(ip->i_d.di_flags & (XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND)))) {
+	if (xfs_can_free_eofblocks(ip, false)) {
 
 		/*
 		 * If we can't get the iolock just skip truncating the blocks
@@ -516,13 +512,12 @@ xfs_inactive(
 		goto out;
 
 	if (ip->i_d.di_nlink != 0) {
-		if ((S_ISREG(ip->i_d.di_mode) &&
-		    (VFS_I(ip)->i_size > 0 ||
-		     (VN_CACHED(VFS_I(ip)) > 0 || ip->i_delayed_blks > 0)) &&
-		    (ip->i_df.if_flags & XFS_IFEXTENTS) &&
-		    (!(ip->i_d.di_flags &
-				(XFS_DIFLAG_PREALLOC | XFS_DIFLAG_APPEND)) ||
-		     ip->i_delayed_blks != 0))) {
+		/*
+		 * force is true because we are evicting an inode from the
+		 * cache. Post-eof blocks must be freed, lest we end up with
+		 * broken free space accounting.
+		 */
+		if (xfs_can_free_eofblocks(ip, true)) {
 			error = xfs_free_eofblocks(mp, ip, false);
 			if (error)
 				return VN_INACTIVE_CACHE;
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 04/10] xfs: make xfs_free_eofblocks() non-static, return EAGAIN on trylock failure
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (2 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-05 14:17 ` [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes Brian Foster
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Turn xfs_free_eofblocks() into a non-static function, return EAGAIN to
indicate trylock failure and make sure this error is not propagated in
xfs_release().

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_vnodeops.c |    6 +++---
 fs/xfs/xfs_vnodeops.h |    1 +
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 1f3c53d8..21665fa 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -151,7 +151,7 @@ xfs_readlink(
  * when the link count isn't zero and by xfs_dm_punch_hole() when
  * punching a hole to EOF.
  */
-STATIC int
+int
 xfs_free_eofblocks(
 	xfs_mount_t	*mp,
 	xfs_inode_t	*ip,
@@ -200,7 +200,7 @@ xfs_free_eofblocks(
 		if (need_iolock) {
 			if (!xfs_ilock_nowait(ip, XFS_IOLOCK_EXCL)) {
 				xfs_trans_cancel(tp, 0);
-				return 0;
+				return EAGAIN;
 			}
 		}
 
@@ -463,7 +463,7 @@ xfs_release(
 			return 0;
 
 		error = xfs_free_eofblocks(mp, ip, true);
-		if (error)
+		if (error && error != EAGAIN)
 			return error;
 
 		/* delalloc blocks after truncation means it really is dirty */
diff --git a/fs/xfs/xfs_vnodeops.h b/fs/xfs/xfs_vnodeops.h
index 447e146..52fafc4 100644
--- a/fs/xfs/xfs_vnodeops.h
+++ b/fs/xfs/xfs_vnodeops.h
@@ -57,5 +57,6 @@ int xfs_flush_pages(struct xfs_inode *ip, xfs_off_t first,
 int xfs_wait_on_pages(struct xfs_inode *ip, xfs_off_t first, xfs_off_t last);
 
 int xfs_zero_eof(struct xfs_inode *, xfs_off_t, xfs_fsize_t);
+int xfs_free_eofblocks(struct xfs_mount *, struct xfs_inode *, bool);
 
 #endif /* _XFS_VNODEOPS_H */
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (3 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 04/10] xfs: make xfs_free_eofblocks() non-static, return EAGAIN on trylock failure Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  1:01   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl Brian Foster
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

xfs_inodes_free_eofblocks() implements scanning functionality for
EOFBLOCKS inodes. It uses the AG iterator to walk the tagged inodes
and free post-EOF blocks via the xfs_inode_free_eofblocks() execute
function. The scan can be invoked in best-effort mode or wait
(force) mode.

A best-effort scan (default) handles all inodes that do not have a
dirty cache and we successfully acquire the io lock via trylock. In
wait mode, we continue to cycle through an AG until all inodes are
handled.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_icache.c |   43 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h |    1 +
 fs/xfs/xfs_trace.h  |    1 +
 3 files changed, 45 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 878eb24..b02a3df 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1170,6 +1170,49 @@ xfs_reclaim_inodes_count(
 	return reclaimable;
 }
 
+STATIC int
+xfs_inode_free_eofblocks(
+	struct xfs_inode	*ip,
+	struct xfs_perag	*pag,
+	int			flags,
+	void			*args)
+{
+	int ret;
+
+	if (!xfs_can_free_eofblocks(ip, false)) {
+		/* inode could be preallocated or append-only */
+		trace_xfs_inode_free_eofblocks_invalid(ip);
+		xfs_inode_clear_eofblocks_tag(ip);
+		return 0;
+	}
+
+	/*
+	 * If the mapping is dirty the operation can block and wait for some
+	 * time. Unless we are waiting, skip it.
+	 */
+	if (!(flags & SYNC_WAIT) &&
+	    mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY))
+		return 0;
+
+	ret = xfs_free_eofblocks(ip->i_mount, ip, true);
+
+	/* don't revisit the inode if we're not waiting */
+	if (ret == EAGAIN && !(flags & SYNC_WAIT))
+		ret = 0;
+
+	return ret;
+}
+
+int
+xfs_icache_free_eofblocks(
+	struct xfs_mount	*mp,
+	int			flags)
+{
+	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
+	return xfs_inode_ag_iterator_tag(mp, xfs_inode_free_eofblocks, flags,
+					 NULL, XFS_ICI_EOFBLOCKS_TAG);
+}
+
 void
 xfs_inode_set_eofblocks_tag(
 	xfs_inode_t	*ip)
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index bb80237..abca9fb 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -37,6 +37,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
+int xfs_icache_free_eofblocks(struct xfs_mount *, int);
 
 int xfs_sync_inode_grab(struct xfs_inode *ip);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 6f46e03..cb52346 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -589,6 +589,7 @@ DEFINE_INODE_EVENT(xfs_dquot_dqdetach);
 
 DEFINE_INODE_EVENT(xfs_inode_set_eofblocks_tag);
 DEFINE_INODE_EVENT(xfs_inode_clear_eofblocks_tag);
+DEFINE_INODE_EVENT(xfs_inode_free_eofblocks_invalid);
 
 DECLARE_EVENT_CLASS(xfs_iref_class,
 	TP_PROTO(struct xfs_inode *ip, unsigned long caller_ip),
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (4 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-11 14:13   ` Ben Myers
  2012-10-23  1:31   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 07/10] xfs: make xfs_quota_type() non-static Brian Foster
                   ` (5 subsequent siblings)
  11 siblings, 2 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
scan. The xfs_eofblocks structure is defined to support the command
parameters (scan mode).

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_fs.h     |   15 +++++++++++++++
 fs/xfs/xfs_icache.c |    5 +++--
 fs/xfs/xfs_icache.h |    2 +-
 fs/xfs/xfs_ioctl.c  |   16 ++++++++++++++++
 4 files changed, 35 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index c13fed8..5f48b3e 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -339,6 +339,20 @@ typedef struct xfs_error_injection {
 
 
 /*
+ * Speculative preallocation trimming.
+ */
+#define XFS_EOFBLOCKS_VERSION		1
+struct xfs_eofblocks {
+	__u32		eof_version;
+	__u32		eof_flags;
+	unsigned char	pad[12];
+};
+
+/* eof_flags values */
+#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
+
+
+/*
  * The user-level Handle Request interface structure.
  */
 typedef struct xfs_fsop_handlereq {
@@ -456,6 +470,7 @@ typedef struct xfs_handle {
 /*	XFS_IOC_GETBIOSIZE ---- deprecated 47	   */
 #define XFS_IOC_GETBMAPX	_IOWR('X', 56, struct getbmap)
 #define XFS_IOC_ZERO_RANGE	_IOW ('X', 57, struct xfs_flock64)
+#define XFS_IOC_FREE_EOFBLOCKS	_IOR ('X', 58, struct xfs_eofblocks)
 
 /*
  * ioctl commands that replace IRIX syssgi()'s
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index b02a3df..35efdda 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1206,11 +1206,12 @@ xfs_inode_free_eofblocks(
 int
 xfs_icache_free_eofblocks(
 	struct xfs_mount	*mp,
-	int			flags)
+	int			flags,
+	struct xfs_eofblocks	*eofb)
 {
 	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
 	return xfs_inode_ag_iterator_tag(mp, xfs_inode_free_eofblocks, flags,
-					 NULL, XFS_ICI_EOFBLOCKS_TAG);
+					 eofb, XFS_ICI_EOFBLOCKS_TAG);
 }
 
 void
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index abca9fb..b46716c 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -37,7 +37,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
-int xfs_icache_free_eofblocks(struct xfs_mount *, int);
+int xfs_icache_free_eofblocks(struct xfs_mount *, int, struct xfs_eofblocks *);
 
 int xfs_sync_inode_grab(struct xfs_inode *ip);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 0e0232c..ad4352f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -42,6 +42,7 @@
 #include "xfs_inode_item.h"
 #include "xfs_export.h"
 #include "xfs_trace.h"
+#include "xfs_icache.h"
 
 #include <linux/capability.h>
 #include <linux/dcache.h>
@@ -1602,6 +1603,21 @@ xfs_file_ioctl(
 		error = xfs_errortag_clearall(mp, 1);
 		return -error;
 
+	case XFS_IOC_FREE_EOFBLOCKS: {
+		struct xfs_eofblocks eofb;
+		int flags;
+
+		if (copy_from_user(&eofb, arg, sizeof(eofb)))
+			return -XFS_ERROR(EFAULT);
+
+		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
+			return -XFS_ERROR(EINVAL);
+
+		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
+		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
+		return -error;
+	}
+
 	default:
 		return -ENOTTY;
 	}
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 07/10] xfs: make xfs_quota_type() non-static
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (5 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  1:31   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan Brian Foster
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Make xfs_quota_type() into a non-static function. This is to
support quota ID filtering in the eofblocks ioctl().

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_quota.h    |    1 +
 fs/xfs/xfs_quotaops.c |    2 +-
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/xfs_quota.h b/fs/xfs/xfs_quota.h
index b50ec5b..3ec6224 100644
--- a/fs/xfs/xfs_quota.h
+++ b/fs/xfs/xfs_quota.h
@@ -379,6 +379,7 @@ static inline int xfs_trans_reserve_quota_bydquots(struct xfs_trans *tp,
 extern int xfs_qm_dqcheck(struct xfs_mount *, xfs_disk_dquot_t *,
 				xfs_dqid_t, uint, uint, char *);
 extern int xfs_mount_reset_sbqflags(struct xfs_mount *);
+int xfs_quota_type(int);
 
 #endif	/* __KERNEL__ */
 #endif	/* __XFS_QUOTA_H__ */
diff --git a/fs/xfs/xfs_quotaops.c b/fs/xfs/xfs_quotaops.c
index fed504f..fe15aa7 100644
--- a/fs/xfs/xfs_quotaops.c
+++ b/fs/xfs/xfs_quotaops.c
@@ -28,7 +28,7 @@
 #include <linux/quota.h>
 
 
-STATIC int
+int
 xfs_quota_type(int type)
 {
 	switch (type) {
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (6 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 07/10] xfs: make xfs_quota_type() non-static Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  1:42   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 09/10] xfs: add minimum file size " Brian Foster
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Support quota ID filtering in the eofblocks scan. The caller must
set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
associated quota type must be enabled on the mounted filesystem.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_fs.h     |    3 +++
 fs/xfs/xfs_icache.c |   24 ++++++++++++++++++++++++
 fs/xfs/xfs_ioctl.c  |    8 ++++++++
 3 files changed, 35 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index 5f48b3e..6ef9111 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -345,11 +345,14 @@ typedef struct xfs_error_injection {
 struct xfs_eofblocks {
 	__u32		eof_version;
 	__u32		eof_flags;
+	__u32		eof_q_id;
+	__u32		eof_q_type;
 	unsigned char	pad[12];
 };
 
 /* eof_flags values */
 #define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
+#define XFS_EOF_FLAGS_QUOTA		0x02	/* filter by quota id */
 
 
 /*
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 35efdda..b39970b 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1171,6 +1171,22 @@ xfs_reclaim_inodes_count(
 }
 
 STATIC int
+xfs_inode_match_quota_id(
+	struct xfs_inode	*ip,
+	struct xfs_eofblocks	*eofb)
+{
+	unsigned int type = xfs_quota_type(eofb->eof_q_type);
+	if (type == XFS_DQ_USER)
+		return ip->i_d.di_uid == eofb->eof_q_id;
+	else if (type == XFS_DQ_GROUP)
+		return ip->i_d.di_gid == eofb->eof_q_id;
+	else if (type == XFS_DQ_PROJ)
+		return xfs_get_projid(ip) == eofb->eof_q_id;
+
+	return 0;
+}
+
+STATIC int
 xfs_inode_free_eofblocks(
 	struct xfs_inode	*ip,
 	struct xfs_perag	*pag,
@@ -1178,6 +1194,7 @@ xfs_inode_free_eofblocks(
 	void			*args)
 {
 	int ret;
+	struct xfs_eofblocks *eofb = args;
 
 	if (!xfs_can_free_eofblocks(ip, false)) {
 		/* inode could be preallocated or append-only */
@@ -1194,6 +1211,13 @@ xfs_inode_free_eofblocks(
 	    mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY))
 		return 0;
 
+	if (eofb) {
+		/* filter by quota id */
+		if (eofb->eof_flags & XFS_EOF_FLAGS_QUOTA &&
+		    !xfs_inode_match_quota_id(ip, eofb))
+			return 0;
+	}
+
 	ret = xfs_free_eofblocks(ip->i_mount, ip, true);
 
 	/* don't revisit the inode if we're not waiting */
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index ad4352f..547363b 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1613,6 +1613,14 @@ xfs_file_ioctl(
 		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
 			return -XFS_ERROR(EINVAL);
 
+		if (eofb.eof_flags & XFS_EOF_FLAGS_QUOTA) {
+			unsigned int type = xfs_quota_type(eofb.eof_q_type);
+			if ((type == XFS_DQ_USER && !XFS_IS_UQUOTA_ON(mp)) ||
+			    (type == XFS_DQ_GROUP && !XFS_IS_GQUOTA_ON(mp)) ||
+			    (type == XFS_DQ_PROJ && !XFS_IS_PQUOTA_ON(mp)))
+				return -XFS_ERROR(EINVAL);
+		}
+
 		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
 		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
 		return -error;
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 09/10] xfs: add minimum file size filtering to eofblocks scan
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (7 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  1:43   ` Dave Chinner
  2012-10-05 14:17 ` [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes Brian Foster
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Support minimum file size filtering in the eofblocks scan. The
caller must set the XFS_EOF_FLAGS_MINFILESIZE flags bit and minimum
file size value in bytes.

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_fs.h     |    2 ++
 fs/xfs/xfs_icache.c |    5 +++++
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
index 6ef9111..6d42b0d 100644
--- a/fs/xfs/xfs_fs.h
+++ b/fs/xfs/xfs_fs.h
@@ -347,12 +347,14 @@ struct xfs_eofblocks {
 	__u32		eof_flags;
 	__u32		eof_q_id;
 	__u32		eof_q_type;
+	__u32		eof_min_file_size;
 	unsigned char	pad[12];
 };
 
 /* eof_flags values */
 #define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
 #define XFS_EOF_FLAGS_QUOTA		0x02	/* filter by quota id */
+#define XFS_EOF_FLAGS_MINFILESIZE	0x04	/* filter by min file size */
 
 
 /*
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index b39970b..ffd7f86 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1216,6 +1216,11 @@ xfs_inode_free_eofblocks(
 		if (eofb->eof_flags & XFS_EOF_FLAGS_QUOTA &&
 		    !xfs_inode_match_quota_id(ip, eofb))
 			return 0;
+
+		/* skip the inode if the file size is too small */
+		if (eofb->eof_flags & XFS_EOF_FLAGS_MINFILESIZE &&
+		    XFS_ISIZE(ip) < eofb->eof_min_file_size)
+			return 0;
 	}
 
 	ret = xfs_free_eofblocks(ip->i_mount, ip, true);
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (8 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 09/10] xfs: add minimum file size " Brian Foster
@ 2012-10-05 14:17 ` Brian Foster
  2012-10-23  1:55   ` Dave Chinner
  2012-10-19 21:02 ` [PATCH v5 00/10] speculative preallocation inode tracking Mark Tinguely
  2012-10-23 19:10 ` Ben Myers
  11 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-05 14:17 UTC (permalink / raw)
  To: xfs

Create a new mount workqueue and delayed_work to enable background
scanning and freeing of eofblocks inodes. The scanner kicks in once
speculative preallocation occurs and stops requeueing itself when
no eofblocks inodes exist.

The scan interval is based on the new
'background_prealloc_discard_period' tunable (default to 5m). The
background scanner performs unfiltered, best effort scans (which
skips inodes under lock contention or with a dirty cache mapping).

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 fs/xfs/xfs_globals.c |    1 +
 fs/xfs/xfs_icache.c  |   29 +++++++++++++++++++++++++++++
 fs/xfs/xfs_icache.h  |    1 +
 fs/xfs/xfs_linux.h   |    1 +
 fs/xfs/xfs_mount.c   |    2 ++
 fs/xfs/xfs_mount.h   |    3 +++
 fs/xfs/xfs_super.c   |    9 +++++++++
 fs/xfs/xfs_sysctl.c  |    9 +++++++++
 fs/xfs/xfs_sysctl.h  |    1 +
 9 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/fs/xfs/xfs_globals.c b/fs/xfs/xfs_globals.c
index 76e81cf..efea30a 100644
--- a/fs/xfs/xfs_globals.c
+++ b/fs/xfs/xfs_globals.c
@@ -40,4 +40,5 @@ xfs_param_t xfs_params = {
 	.rotorstep	= {	1,		1,		255	},
 	.inherit_nodfrg	= {	0,		1,		1	},
 	.fstrm_timer	= {	1,		30*100,		3600*100},
+	.eofb_timer	= {	1,		300,		7200},
 };
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index ffd7f86..1ab560c 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -615,6 +615,32 @@ restart:
 	return last_error;
 }
 
+/*
+ * Background scanning to trim post-EOF preallocated space. This is queued
+ * based on the 'background_prealloc_discard_period' tunable (5m by default).
+ */
+STATIC void
+xfs_queue_eofblocks(
+	struct xfs_mount *mp)
+{
+	rcu_read_lock();
+	if (radix_tree_tagged(&mp->m_perag_tree, XFS_ICI_EOFBLOCKS_TAG))
+		queue_delayed_work(mp->m_eofblocks_workqueue,
+				   &mp->m_eofblocks_work,
+				   msecs_to_jiffies(xfs_eofb_secs * 1000));
+	rcu_read_unlock();
+}
+
+void
+xfs_eofblocks_worker(
+	struct work_struct *work)
+{
+	struct xfs_mount *mp = container_of(to_delayed_work(work),
+				struct xfs_mount, m_eofblocks_work);
+	xfs_icache_free_eofblocks(mp, SYNC_TRYLOCK, NULL);
+	xfs_queue_eofblocks(mp);
+}
+
 int
 xfs_inode_ag_iterator(
 	struct xfs_mount	*mp,
@@ -1268,6 +1294,9 @@ xfs_inode_set_eofblocks_tag(
 				   XFS_ICI_EOFBLOCKS_TAG);
 		spin_unlock(&ip->i_mount->m_perag_lock);
 
+		/* kick off background trimming */
+		xfs_queue_eofblocks(ip->i_mount);
+
 		trace_xfs_perag_set_eofblocks(ip->i_mount, pag->pag_agno,
 					      -1, _RET_IP_);
 	}
diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
index b46716c..0089286 100644
--- a/fs/xfs/xfs_icache.h
+++ b/fs/xfs/xfs_icache.h
@@ -38,6 +38,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
 void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
 void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
 int xfs_icache_free_eofblocks(struct xfs_mount *, int, struct xfs_eofblocks *);
+void xfs_eofblocks_worker(struct work_struct *);
 
 int xfs_sync_inode_grab(struct xfs_inode *ip);
 int xfs_inode_ag_iterator(struct xfs_mount *mp,
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index 828662f..0a134ca 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -118,6 +118,7 @@
 #define xfs_rotorstep		xfs_params.rotorstep.val
 #define xfs_inherit_nodefrag	xfs_params.inherit_nodfrg.val
 #define xfs_fstrm_centisecs	xfs_params.fstrm_timer.val
+#define xfs_eofb_secs		xfs_params.eofb_timer.val
 
 #define current_cpu()		(raw_smp_processor_id())
 #define current_pid()		(current->pid)
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 6f1c997..41ae7e1 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1428,6 +1428,8 @@ xfs_unmountfs(
 	__uint64_t		resblks;
 	int			error;
 
+	cancel_delayed_work_sync(&mp->m_eofblocks_work);
+
 	xfs_qm_unmount_quotas(mp);
 	xfs_rtunmount_inodes(mp);
 	IRELE(mp->m_rootip);
diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h
index a631ca3..dc306a0 100644
--- a/fs/xfs/xfs_mount.h
+++ b/fs/xfs/xfs_mount.h
@@ -196,6 +196,8 @@ typedef struct xfs_mount {
 #endif
 	struct xfs_mru_cache	*m_filestream;  /* per-mount filestream data */
 	struct delayed_work	m_reclaim_work;	/* background inode reclaim */
+	struct delayed_work	m_eofblocks_work; /* background eof blocks
+						     trimming */
 	__int64_t		m_update_flags;	/* sb flags we need to update
 						   on the next remount,rw */
 	struct shrinker		m_inode_shrink;	/* inode reclaim shrinker */
@@ -207,6 +209,7 @@ typedef struct xfs_mount {
 	struct workqueue_struct	*m_cil_workqueue;
 	struct workqueue_struct	*m_reclaim_workqueue;
 	struct workqueue_struct	*m_log_workqueue;
+	struct workqueue_struct *m_eofblocks_workqueue;
 } xfs_mount_t;
 
 /*
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5e6c5b8..e6b3083 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -874,8 +874,15 @@ xfs_init_mount_workqueues(
 	if (!mp->m_log_workqueue)
 		goto out_destroy_reclaim;
 
+	mp->m_eofblocks_workqueue = alloc_workqueue("xfs-eofblocks/%s",
+			WQ_NON_REENTRANT, 0, mp->m_fsname);
+	if (!mp->m_eofblocks_workqueue)
+		goto out_destroy_log;
+
 	return 0;
 
+out_destroy_log:
+	destroy_workqueue(mp->m_log_workqueue);
 out_destroy_reclaim:
 	destroy_workqueue(mp->m_reclaim_workqueue);
 out_destroy_cil:
@@ -892,6 +899,7 @@ STATIC void
 xfs_destroy_mount_workqueues(
 	struct xfs_mount	*mp)
 {
+	destroy_workqueue(mp->m_eofblocks_workqueue);
 	destroy_workqueue(mp->m_log_workqueue);
 	destroy_workqueue(mp->m_reclaim_workqueue);
 	destroy_workqueue(mp->m_cil_workqueue);
@@ -1375,6 +1383,7 @@ xfs_fs_fill_super(
 	mutex_init(&mp->m_growlock);
 	atomic_set(&mp->m_active_trans, 0);
 	INIT_DELAYED_WORK(&mp->m_reclaim_work, xfs_reclaim_worker);
+	INIT_DELAYED_WORK(&mp->m_eofblocks_work, xfs_eofblocks_worker);
 
 	mp->m_super = sb;
 	sb->s_fs_info = mp;
diff --git a/fs/xfs/xfs_sysctl.c b/fs/xfs/xfs_sysctl.c
index ee2d2ad..3136d45 100644
--- a/fs/xfs/xfs_sysctl.c
+++ b/fs/xfs/xfs_sysctl.c
@@ -202,6 +202,15 @@ static ctl_table xfs_table[] = {
 		.extra1		= &xfs_params.fstrm_timer.min,
 		.extra2		= &xfs_params.fstrm_timer.max,
 	},
+	{
+		.procname	= "background_prealloc_discard_period",
+		.data		= &xfs_params.eofb_timer.val,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &xfs_params.eofb_timer.min,
+		.extra2		= &xfs_params.eofb_timer.max,
+	},
 	/* please keep this the last entry */
 #ifdef CONFIG_PROC_FS
 	{
diff --git a/fs/xfs/xfs_sysctl.h b/fs/xfs/xfs_sysctl.h
index b9937d4..bd8e157 100644
--- a/fs/xfs/xfs_sysctl.h
+++ b/fs/xfs/xfs_sysctl.h
@@ -47,6 +47,7 @@ typedef struct xfs_param {
 	xfs_sysctl_val_t rotorstep;	/* inode32 AG rotoring control knob */
 	xfs_sysctl_val_t inherit_nodfrg;/* Inherit the "nodefrag" inode flag. */
 	xfs_sysctl_val_t fstrm_timer;	/* Filestream dir-AG assoc'n timeout. */
+	xfs_sysctl_val_t eofb_timer;	/* Interval between eofb scan wakeups */
 } xfs_param_t;
 
 /*
-- 
1.7.7.6

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-05 14:17 ` [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl Brian Foster
@ 2012-10-11 14:13   ` Ben Myers
  2012-10-11 22:35     ` Brian Foster
  2012-10-23  1:31   ` Dave Chinner
  1 sibling, 1 reply; 43+ messages in thread
From: Ben Myers @ 2012-10-11 14:13 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

Hey Brian,

On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> scan. The xfs_eofblocks structure is defined to support the command
> parameters (scan mode).

It would help to have an xfstest to exercise this ioctl to pull in with this
series.  Do you have any code that could be wrangled into a test case?

Regards,
	Ben

> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_fs.h     |   15 +++++++++++++++
>  fs/xfs/xfs_icache.c |    5 +++--
>  fs/xfs/xfs_icache.h |    2 +-
>  fs/xfs/xfs_ioctl.c  |   16 ++++++++++++++++
>  4 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
> index c13fed8..5f48b3e 100644
> --- a/fs/xfs/xfs_fs.h
> +++ b/fs/xfs/xfs_fs.h
> @@ -339,6 +339,20 @@ typedef struct xfs_error_injection {
>  
>  
>  /*
> + * Speculative preallocation trimming.
> + */
> +#define XFS_EOFBLOCKS_VERSION		1
> +struct xfs_eofblocks {
> +	__u32		eof_version;
> +	__u32		eof_flags;
> +	unsigned char	pad[12];
> +};
> +
> +/* eof_flags values */
> +#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
> +
> +
> +/*
>   * The user-level Handle Request interface structure.
>   */
>  typedef struct xfs_fsop_handlereq {
> @@ -456,6 +470,7 @@ typedef struct xfs_handle {
>  /*	XFS_IOC_GETBIOSIZE ---- deprecated 47	   */
>  #define XFS_IOC_GETBMAPX	_IOWR('X', 56, struct getbmap)
>  #define XFS_IOC_ZERO_RANGE	_IOW ('X', 57, struct xfs_flock64)
> +#define XFS_IOC_FREE_EOFBLOCKS	_IOR ('X', 58, struct xfs_eofblocks)
>  
>  /*
>   * ioctl commands that replace IRIX syssgi()'s
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index b02a3df..35efdda 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -1206,11 +1206,12 @@ xfs_inode_free_eofblocks(
>  int
>  xfs_icache_free_eofblocks(
>  	struct xfs_mount	*mp,
> -	int			flags)
> +	int			flags,
> +	struct xfs_eofblocks	*eofb)
>  {
>  	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
>  	return xfs_inode_ag_iterator_tag(mp, xfs_inode_free_eofblocks, flags,
> -					 NULL, XFS_ICI_EOFBLOCKS_TAG);
> +					 eofb, XFS_ICI_EOFBLOCKS_TAG);
>  }
>  
>  void
> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> index abca9fb..b46716c 100644
> --- a/fs/xfs/xfs_icache.h
> +++ b/fs/xfs/xfs_icache.h
> @@ -37,7 +37,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
>  
>  void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
>  void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
> -int xfs_icache_free_eofblocks(struct xfs_mount *, int);
> +int xfs_icache_free_eofblocks(struct xfs_mount *, int, struct xfs_eofblocks *);
>  
>  int xfs_sync_inode_grab(struct xfs_inode *ip);
>  int xfs_inode_ag_iterator(struct xfs_mount *mp,
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 0e0232c..ad4352f 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -42,6 +42,7 @@
>  #include "xfs_inode_item.h"
>  #include "xfs_export.h"
>  #include "xfs_trace.h"
> +#include "xfs_icache.h"
>  
>  #include <linux/capability.h>
>  #include <linux/dcache.h>
> @@ -1602,6 +1603,21 @@ xfs_file_ioctl(
>  		error = xfs_errortag_clearall(mp, 1);
>  		return -error;
>  
> +	case XFS_IOC_FREE_EOFBLOCKS: {
> +		struct xfs_eofblocks eofb;
> +		int flags;
> +
> +		if (copy_from_user(&eofb, arg, sizeof(eofb)))
> +			return -XFS_ERROR(EFAULT);
> +
> +		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
> +			return -XFS_ERROR(EINVAL);
> +
> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
> +		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
> +		return -error;
> +	}
> +
>  	default:
>  		return -ENOTTY;
>  	}
> -- 
> 1.7.7.6
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-11 14:13   ` Ben Myers
@ 2012-10-11 22:35     ` Brian Foster
  2012-10-15 22:46       ` Ben Myers
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-11 22:35 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

On 10/11/2012 10:13 AM, Ben Myers wrote:
> Hey Brian,
> 
> On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
>> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
>> scan. The xfs_eofblocks structure is defined to support the command
>> parameters (scan mode).
> 
> It would help to have an xfstest to exercise this ioctl to pull in with this
> series.  Do you have any code that could be wrangled into a test case?
> 

Yes, makes sense. I have some very basic test code I could put somewhere
to invoke the ioctl(). One of the questions I've been meaning to ask is
whether it would be relevant for that code to live in a common tool,
such as adding a new command to xfs_io. Then perhaps create an xfstests
test using that. Thoughts?

FYI, I have a few other things on my plate at the moment so
unfortunately it will be a bit before I can get back to XFS work... But
I'm fine with the set pending until I can come up with some test
coverage if that is preferable, of course.

Brian

> Regards,
> 	Ben
> 
>>
>> Signed-off-by: Brian Foster <bfoster@redhat.com>
>> ---
>>  fs/xfs/xfs_fs.h     |   15 +++++++++++++++
>>  fs/xfs/xfs_icache.c |    5 +++--
>>  fs/xfs/xfs_icache.h |    2 +-
>>  fs/xfs/xfs_ioctl.c  |   16 ++++++++++++++++
>>  4 files changed, 35 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
>> index c13fed8..5f48b3e 100644
>> --- a/fs/xfs/xfs_fs.h
>> +++ b/fs/xfs/xfs_fs.h
>> @@ -339,6 +339,20 @@ typedef struct xfs_error_injection {
>>  
>>  
>>  /*
>> + * Speculative preallocation trimming.
>> + */
>> +#define XFS_EOFBLOCKS_VERSION		1
>> +struct xfs_eofblocks {
>> +	__u32		eof_version;
>> +	__u32		eof_flags;
>> +	unsigned char	pad[12];
>> +};
>> +
>> +/* eof_flags values */
>> +#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
>> +
>> +
>> +/*
>>   * The user-level Handle Request interface structure.
>>   */
>>  typedef struct xfs_fsop_handlereq {
>> @@ -456,6 +470,7 @@ typedef struct xfs_handle {
>>  /*	XFS_IOC_GETBIOSIZE ---- deprecated 47	   */
>>  #define XFS_IOC_GETBMAPX	_IOWR('X', 56, struct getbmap)
>>  #define XFS_IOC_ZERO_RANGE	_IOW ('X', 57, struct xfs_flock64)
>> +#define XFS_IOC_FREE_EOFBLOCKS	_IOR ('X', 58, struct xfs_eofblocks)
>>  
>>  /*
>>   * ioctl commands that replace IRIX syssgi()'s
>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>> index b02a3df..35efdda 100644
>> --- a/fs/xfs/xfs_icache.c
>> +++ b/fs/xfs/xfs_icache.c
>> @@ -1206,11 +1206,12 @@ xfs_inode_free_eofblocks(
>>  int
>>  xfs_icache_free_eofblocks(
>>  	struct xfs_mount	*mp,
>> -	int			flags)
>> +	int			flags,
>> +	struct xfs_eofblocks	*eofb)
>>  {
>>  	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
>>  	return xfs_inode_ag_iterator_tag(mp, xfs_inode_free_eofblocks, flags,
>> -					 NULL, XFS_ICI_EOFBLOCKS_TAG);
>> +					 eofb, XFS_ICI_EOFBLOCKS_TAG);
>>  }
>>  
>>  void
>> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
>> index abca9fb..b46716c 100644
>> --- a/fs/xfs/xfs_icache.h
>> +++ b/fs/xfs/xfs_icache.h
>> @@ -37,7 +37,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
>>  
>>  void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
>>  void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
>> -int xfs_icache_free_eofblocks(struct xfs_mount *, int);
>> +int xfs_icache_free_eofblocks(struct xfs_mount *, int, struct xfs_eofblocks *);
>>  
>>  int xfs_sync_inode_grab(struct xfs_inode *ip);
>>  int xfs_inode_ag_iterator(struct xfs_mount *mp,
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index 0e0232c..ad4352f 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -42,6 +42,7 @@
>>  #include "xfs_inode_item.h"
>>  #include "xfs_export.h"
>>  #include "xfs_trace.h"
>> +#include "xfs_icache.h"
>>  
>>  #include <linux/capability.h>
>>  #include <linux/dcache.h>
>> @@ -1602,6 +1603,21 @@ xfs_file_ioctl(
>>  		error = xfs_errortag_clearall(mp, 1);
>>  		return -error;
>>  
>> +	case XFS_IOC_FREE_EOFBLOCKS: {
>> +		struct xfs_eofblocks eofb;
>> +		int flags;
>> +
>> +		if (copy_from_user(&eofb, arg, sizeof(eofb)))
>> +			return -XFS_ERROR(EFAULT);
>> +
>> +		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
>> +			return -XFS_ERROR(EINVAL);
>> +
>> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
>> +		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
>> +		return -error;
>> +	}
>> +
>>  	default:
>>  		return -ENOTTY;
>>  	}
>> -- 
>> 1.7.7.6
>>
>> _______________________________________________
>> xfs mailing list
>> xfs@oss.sgi.com
>> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-11 22:35     ` Brian Foster
@ 2012-10-15 22:46       ` Ben Myers
  2012-10-15 23:49         ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Ben Myers @ 2012-10-15 22:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

Hey Brian,

On Thu, Oct 11, 2012 at 06:35:14PM -0400, Brian Foster wrote:
> On 10/11/2012 10:13 AM, Ben Myers wrote:
> > Hey Brian,
> > 
> > On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> >> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> >> scan. The xfs_eofblocks structure is defined to support the command
> >> parameters (scan mode).
> > 
> > It would help to have an xfstest to exercise this ioctl to pull in with this
> > series.  Do you have any code that could be wrangled into a test case?
> > 
> 
> Yes, makes sense. I have some very basic test code I could put somewhere
> to invoke the ioctl(). One of the questions I've been meaning to ask is
> whether it would be relevant for that code to live in a common tool,
> such as adding a new command to xfs_io. Then perhaps create an xfstests
> test using that. Thoughts?

IMO you are right on the mark.  xfs_io is a great place for this.

> FYI, I have a few other things on my plate at the moment so
> unfortunately it will be a bit before I can get back to XFS work... But
> I'm fine with the set pending until I can come up with some test
> coverage if that is preferable, of course.

I do think it is preferable to have a test case go in with the code where
possible.  Since you don't mind waiting a bit, that seems to be the way to go.
The other option could be to look for a volunteer to work on the test.  ;)

Regards,
	Ben

> Brian
> 
> > Regards,
> > 	Ben
> > 
> >>
> >> Signed-off-by: Brian Foster <bfoster@redhat.com>
> >> ---
> >>  fs/xfs/xfs_fs.h     |   15 +++++++++++++++
> >>  fs/xfs/xfs_icache.c |    5 +++--
> >>  fs/xfs/xfs_icache.h |    2 +-
> >>  fs/xfs/xfs_ioctl.c  |   16 ++++++++++++++++
> >>  4 files changed, 35 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
> >> index c13fed8..5f48b3e 100644
> >> --- a/fs/xfs/xfs_fs.h
> >> +++ b/fs/xfs/xfs_fs.h
> >> @@ -339,6 +339,20 @@ typedef struct xfs_error_injection {
> >>  
> >>  
> >>  /*
> >> + * Speculative preallocation trimming.
> >> + */
> >> +#define XFS_EOFBLOCKS_VERSION		1
> >> +struct xfs_eofblocks {
> >> +	__u32		eof_version;
> >> +	__u32		eof_flags;
> >> +	unsigned char	pad[12];
> >> +};
> >> +
> >> +/* eof_flags values */
> >> +#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
> >> +
> >> +
> >> +/*
> >>   * The user-level Handle Request interface structure.
> >>   */
> >>  typedef struct xfs_fsop_handlereq {
> >> @@ -456,6 +470,7 @@ typedef struct xfs_handle {
> >>  /*	XFS_IOC_GETBIOSIZE ---- deprecated 47	   */
> >>  #define XFS_IOC_GETBMAPX	_IOWR('X', 56, struct getbmap)
> >>  #define XFS_IOC_ZERO_RANGE	_IOW ('X', 57, struct xfs_flock64)
> >> +#define XFS_IOC_FREE_EOFBLOCKS	_IOR ('X', 58, struct xfs_eofblocks)
> >>  
> >>  /*
> >>   * ioctl commands that replace IRIX syssgi()'s
> >> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> >> index b02a3df..35efdda 100644
> >> --- a/fs/xfs/xfs_icache.c
> >> +++ b/fs/xfs/xfs_icache.c
> >> @@ -1206,11 +1206,12 @@ xfs_inode_free_eofblocks(
> >>  int
> >>  xfs_icache_free_eofblocks(
> >>  	struct xfs_mount	*mp,
> >> -	int			flags)
> >> +	int			flags,
> >> +	struct xfs_eofblocks	*eofb)
> >>  {
> >>  	ASSERT((flags & ~(SYNC_TRYLOCK|SYNC_WAIT)) == 0);
> >>  	return xfs_inode_ag_iterator_tag(mp, xfs_inode_free_eofblocks, flags,
> >> -					 NULL, XFS_ICI_EOFBLOCKS_TAG);
> >> +					 eofb, XFS_ICI_EOFBLOCKS_TAG);
> >>  }
> >>  
> >>  void
> >> diff --git a/fs/xfs/xfs_icache.h b/fs/xfs/xfs_icache.h
> >> index abca9fb..b46716c 100644
> >> --- a/fs/xfs/xfs_icache.h
> >> +++ b/fs/xfs/xfs_icache.h
> >> @@ -37,7 +37,7 @@ void xfs_inode_set_reclaim_tag(struct xfs_inode *ip);
> >>  
> >>  void xfs_inode_set_eofblocks_tag(struct xfs_inode *ip);
> >>  void xfs_inode_clear_eofblocks_tag(struct xfs_inode *ip);
> >> -int xfs_icache_free_eofblocks(struct xfs_mount *, int);
> >> +int xfs_icache_free_eofblocks(struct xfs_mount *, int, struct xfs_eofblocks *);
> >>  
> >>  int xfs_sync_inode_grab(struct xfs_inode *ip);
> >>  int xfs_inode_ag_iterator(struct xfs_mount *mp,
> >> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> >> index 0e0232c..ad4352f 100644
> >> --- a/fs/xfs/xfs_ioctl.c
> >> +++ b/fs/xfs/xfs_ioctl.c
> >> @@ -42,6 +42,7 @@
> >>  #include "xfs_inode_item.h"
> >>  #include "xfs_export.h"
> >>  #include "xfs_trace.h"
> >> +#include "xfs_icache.h"
> >>  
> >>  #include <linux/capability.h>
> >>  #include <linux/dcache.h>
> >> @@ -1602,6 +1603,21 @@ xfs_file_ioctl(
> >>  		error = xfs_errortag_clearall(mp, 1);
> >>  		return -error;
> >>  
> >> +	case XFS_IOC_FREE_EOFBLOCKS: {
> >> +		struct xfs_eofblocks eofb;
> >> +		int flags;
> >> +
> >> +		if (copy_from_user(&eofb, arg, sizeof(eofb)))
> >> +			return -XFS_ERROR(EFAULT);
> >> +
> >> +		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
> >> +			return -XFS_ERROR(EINVAL);
> >> +
> >> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
> >> +		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
> >> +		return -error;
> >> +	}
> >> +
> >>  	default:
> >>  		return -ENOTTY;
> >>  	}
> >> -- 
> >> 1.7.7.6
> >>
> >> _______________________________________________
> >> xfs mailing list
> >> xfs@oss.sgi.com
> >> http://oss.sgi.com/mailman/listinfo/xfs
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-15 22:46       ` Ben Myers
@ 2012-10-15 23:49         ` Dave Chinner
  2012-10-16  1:39           ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-15 23:49 UTC (permalink / raw)
  To: Ben Myers; +Cc: Brian Foster, xfs

On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
> Hey Brian,
> 
> On Thu, Oct 11, 2012 at 06:35:14PM -0400, Brian Foster wrote:
> > On 10/11/2012 10:13 AM, Ben Myers wrote:
> > > Hey Brian,
> > > 
> > > On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> > >> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> > >> scan. The xfs_eofblocks structure is defined to support the command
> > >> parameters (scan mode).
> > > 
> > > It would help to have an xfstest to exercise this ioctl to pull in with this
> > > series.  Do you have any code that could be wrangled into a test case?
> > > 
> > 
> > Yes, makes sense. I have some very basic test code I could put somewhere
> > to invoke the ioctl(). One of the questions I've been meaning to ask is
> > whether it would be relevant for that code to live in a common tool,
> > such as adding a new command to xfs_io. Then perhaps create an xfstests
> > test using that. Thoughts?
> 
> IMO you are right on the mark.  xfs_io is a great place for this.
> 
> > FYI, I have a few other things on my plate at the moment so
> > unfortunately it will be a bit before I can get back to XFS work... But
> > I'm fine with the set pending until I can come up with some test
> > coverage if that is preferable, of course.
> 
> I do think it is preferable to have a test case go in with the code where
> possible.  Since you don't mind waiting a bit, that seems to be the way to go.
> The other option could be to look for a volunteer to work on the test.  ;)

FWIW, given the background cleanup code can be trivially verified to
work (open, apend, close, repeat, wait 5 minutes) and is the
functionality that is needed in mainline, having something to test
the ioctls should not stop the patchset from being merged.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-15 23:49         ` Dave Chinner
@ 2012-10-16  1:39           ` Dave Chinner
  2012-10-17 22:40             ` Ben Myers
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-16  1:39 UTC (permalink / raw)
  To: Ben Myers; +Cc: Brian Foster, xfs

On Tue, Oct 16, 2012 at 10:49:02AM +1100, Dave Chinner wrote:
> On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
> > Hey Brian,
> > 
> > On Thu, Oct 11, 2012 at 06:35:14PM -0400, Brian Foster wrote:
> > > On 10/11/2012 10:13 AM, Ben Myers wrote:
> > > > Hey Brian,
> > > > 
> > > > On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> > > >> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> > > >> scan. The xfs_eofblocks structure is defined to support the command
> > > >> parameters (scan mode).
> > > > 
> > > > It would help to have an xfstest to exercise this ioctl to pull in with this
> > > > series.  Do you have any code that could be wrangled into a test case?
> > > > 
> > > 
> > > Yes, makes sense. I have some very basic test code I could put somewhere
> > > to invoke the ioctl(). One of the questions I've been meaning to ask is
> > > whether it would be relevant for that code to live in a common tool,
> > > such as adding a new command to xfs_io. Then perhaps create an xfstests
> > > test using that. Thoughts?
> > 
> > IMO you are right on the mark.  xfs_io is a great place for this.
> > 
> > > FYI, I have a few other things on my plate at the moment so
> > > unfortunately it will be a bit before I can get back to XFS work... But
> > > I'm fine with the set pending until I can come up with some test
> > > coverage if that is preferable, of course.
> > 
> > I do think it is preferable to have a test case go in with the code where
> > possible.  Since you don't mind waiting a bit, that seems to be the way to go.
> > The other option could be to look for a volunteer to work on the test.  ;)
> 
> FWIW, given the background cleanup code can be trivially verified to
> work (open, apend, close, repeat, wait 5 minutes) and is the
> functionality that is needed in mainline, having something to test
> the ioctls should not stop the patchset from being merged.

i.e.:

$ for i in `seq 0 512`; do
> xfs_io -f -c "pwrite $((i * 4096)) 4096" /mnt/scratch/foo
> done
$ stat -c %b /mnt/scratch/foo
8192
$ sync; stat -c %b /mnt/scratch/foo
8192
$ sleep 30; stat -c %b /mnt/scratch/foo
8192
$ sleep 300; stat -c %b /mnt/scratch/foo
4104

It works. ;)

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-16  1:39           ` Dave Chinner
@ 2012-10-17 22:40             ` Ben Myers
  2012-10-18 12:16               ` Brian Foster
  0 siblings, 1 reply; 43+ messages in thread
From: Ben Myers @ 2012-10-17 22:40 UTC (permalink / raw)
  To: Dave Chinner, Brian Foster; +Cc: xfs

Hi Fellas,

On Tue, Oct 16, 2012 at 12:39:01PM +1100, Dave Chinner wrote:
> On Tue, Oct 16, 2012 at 10:49:02AM +1100, Dave Chinner wrote:
> > On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
> > > Hey Brian,
> > > 
> > > On Thu, Oct 11, 2012 at 06:35:14PM -0400, Brian Foster wrote:
> > > > On 10/11/2012 10:13 AM, Ben Myers wrote:
> > > > > Hey Brian,
> > > > > 
> > > > > On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> > > > >> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> > > > >> scan. The xfs_eofblocks structure is defined to support the command
> > > > >> parameters (scan mode).
> > > > > 
> > > > > It would help to have an xfstest to exercise this ioctl to pull in with this
> > > > > series.  Do you have any code that could be wrangled into a test case?
> > > > > 
> > > > 
> > > > Yes, makes sense. I have some very basic test code I could put somewhere
> > > > to invoke the ioctl(). One of the questions I've been meaning to ask is
> > > > whether it would be relevant for that code to live in a common tool,
> > > > such as adding a new command to xfs_io. Then perhaps create an xfstests
> > > > test using that. Thoughts?
> > > 
> > > IMO you are right on the mark.  xfs_io is a great place for this.
> > > 
> > > > FYI, I have a few other things on my plate at the moment so
> > > > unfortunately it will be a bit before I can get back to XFS work... But
> > > > I'm fine with the set pending until I can come up with some test
> > > > coverage if that is preferable, of course.
> > > 
> > > I do think it is preferable to have a test case go in with the code where
> > > possible.  Since you don't mind waiting a bit, that seems to be the way to go.
> > > The other option could be to look for a volunteer to work on the test.  ;)
> > 
> > FWIW, given the background cleanup code can be trivially verified to
> > work (open, apend, close, repeat, wait 5 minutes) and is the
> > functionality that is needed in mainline, having something to test
> > the ioctls should not stop the patchset from being merged.

Can we be assured that we'll get an xfstest for it eventually?

I think we can pull this in if Brian is willing post his test code.  Initially
it needn't be posted as an xfstest, that's fine.  As it stands today it appears
that Brian is the only one to ever use the ioctl and there is no way for anyone
else to test it.  Not an ideal situation.

I think this is a reasonable request.  Usually it's Christoph who asks for a
test case.  ;)

> i.e.:
> 
> $ for i in `seq 0 512`; do
> > xfs_io -f -c "pwrite $((i * 4096)) 4096" /mnt/scratch/foo
> > done
> $ stat -c %b /mnt/scratch/foo
> 8192
> $ sync; stat -c %b /mnt/scratch/foo
> 8192
> $ sleep 30; stat -c %b /mnt/scratch/foo
> 8192
> $ sleep 300; stat -c %b /mnt/scratch/foo
> 4104
> 
> It works. ;)

Nice!

Regards,
Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-17 22:40             ` Ben Myers
@ 2012-10-18 12:16               ` Brian Foster
  2012-10-18 15:46                 ` Ben Myers
  2012-10-22  7:34                 ` Dave Chinner
  0 siblings, 2 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-18 12:16 UTC (permalink / raw)
  To: Ben Myers; +Cc: xfs

[-- Attachment #1: Type: text/plain, Size: 2192 bytes --]

On 10/17/2012 06:40 PM, Ben Myers wrote:
> Hi Fellas,
> 
> On Tue, Oct 16, 2012 at 12:39:01PM +1100, Dave Chinner wrote:
>> On Tue, Oct 16, 2012 at 10:49:02AM +1100, Dave Chinner wrote:
>>> On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
>>>> Hey Brian,
>>>>
...
>>>> I do think it is preferable to have a test case go in with the code where
>>>> possible.  Since you don't mind waiting a bit, that seems to be the way to go.
>>>> The other option could be to look for a volunteer to work on the test.  ;)
>>>
>>> FWIW, given the background cleanup code can be trivially verified to
>>> work (open, apend, close, repeat, wait 5 minutes) and is the
>>> functionality that is needed in mainline, having something to test
>>> the ioctls should not stop the patchset from being merged.
> 
> Can we be assured that we'll get an xfstest for it eventually?
> 

Absolutely. Getting a command into xfs_io to support such a test is now
the top of my todo list with regard to XFS. :)

> I think we can pull this in if Brian is willing post his test code.  Initially
> it needn't be posted as an xfstest, that's fine.  As it stands today it appears
> that Brian is the only one to ever use the ioctl and there is no way for anyone
> else to test it.  Not an ideal situation.
> 

The impetus for this work was some prototype quota work for the gluster
distributed filesystem. Beyond testing via that, I just had a very
simple/stupid program with the command and data structure bits
copy/pasted in to open a file and invoke an xfsctl() with hardcoded
parameters. I don't consider it post-worthy or very useful, but attached
nonetheless since I have a couple things to deal with before I get a
chance to play with xfs_io.

> I think this is a reasonable request.  Usually it's Christoph who asks for a
> test case.  ;)
> 

Agree.

Brian

>> i.e.:
>>
>> $ for i in `seq 0 512`; do
>>> xfs_io -f -c "pwrite $((i * 4096)) 4096" /mnt/scratch/foo
>>> done
>> $ stat -c %b /mnt/scratch/foo
>> 8192
>> $ sync; stat -c %b /mnt/scratch/foo
>> 8192
>> $ sleep 30; stat -c %b /mnt/scratch/foo
>> 8192
>> $ sleep 300; stat -c %b /mnt/scratch/foo
>> 4104
>>
>> It works. ;)
> 
> Nice!
> 
> Regards,
> Ben
> 


[-- Attachment #2: free_eofblocks.c --]
[-- Type: text/x-csrc, Size: 1483 bytes --]

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <xfs/xfs.h>

#include <sys/quota.h>
#ifndef PRJQUOTA
#define PRJQUOTA 2
#endif

/*
 * Speculative preallocation trimming.
 */
#define XFS_EOFBLOCKS_VERSION           1
struct xfs_eofblocks {
        __s32           eof_version;
        __u32           eof_flags;
        __u32           eof_q_id;
        __u32           eof_q_type;
        __u32           eof_min_file_size;
        unsigned char   pad[12];
};

/* eof_flags values */
#define XFS_EOF_FLAGS_SYNC              0x01    /* sync/wait mode scan */
#define XFS_EOF_FLAGS_QUOTA             0x02    /* filter by quota id */
#define XFS_EOF_FLAGS_MINFILESIZE       0x04    /* filter by min file size */

#define XFS_IOC_FREE_EOFBLOCKS  _IOR ('X', 58, struct xfs_eofblocks)

int main(int argc, char *argv[])
{
        int ret, fd;
	struct xfs_eofblocks eofb;

        fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return -1;
	}

	memset(&eofb, 0, sizeof(struct xfs_eofblocks));

	eofb.eof_version = XFS_EOFBLOCKS_VERSION;
	eofb.eof_flags |= XFS_EOF_FLAGS_SYNC;
	//eofb.eof_flags |= XFS_EOF_FLAGS_QUOTA;
	//eofb.eof_q_id = 42;
	//eofb.eof_q_type = PRJQUOTA;
	eofb.eof_flags |= XFS_EOF_FLAGS_MINFILESIZE;
	eofb.eof_min_file_size = 1024 * 1024 * 101;
        ret = xfsctl(NULL, fd, XFS_IOC_FREE_EOFBLOCKS, &eofb);
        if (ret < 0) 
		perror("xfsctl");

        if (close(fd) < 0)
		perror("close");

	return 0;
}


[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-18 12:16               ` Brian Foster
@ 2012-10-18 15:46                 ` Ben Myers
  2012-10-18 16:23                   ` Brian Foster
  2012-10-22  7:34                 ` Dave Chinner
  1 sibling, 1 reply; 43+ messages in thread
From: Ben Myers @ 2012-10-18 15:46 UTC (permalink / raw)
  To: Brian Foster; +Cc: tinguely, xfs

Hey Brian,

On Thu, Oct 18, 2012 at 08:16:57AM -0400, Brian Foster wrote:
> On 10/17/2012 06:40 PM, Ben Myers wrote:
> > On Tue, Oct 16, 2012 at 12:39:01PM +1100, Dave Chinner wrote:
> >> On Tue, Oct 16, 2012 at 10:49:02AM +1100, Dave Chinner wrote:
> >>> On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
> >>>> Hey Brian,
> >>>>
> ...
> >>>> I do think it is preferable to have a test case go in with the code where
> >>>> possible.  Since you don't mind waiting a bit, that seems to be the way to go.
> >>>> The other option could be to look for a volunteer to work on the test.  ;)
> >>>
> >>> FWIW, given the background cleanup code can be trivially verified to
> >>> work (open, apend, close, repeat, wait 5 minutes) and is the
> >>> functionality that is needed in mainline, having something to test
> >>> the ioctls should not stop the patchset from being merged.
> > 
> > Can we be assured that we'll get an xfstest for it eventually?
> > 
> 
> Absolutely. Getting a command into xfs_io to support such a test is now
> the top of my todo list with regard to XFS. :)

Excellent!

> > I think we can pull this in if Brian is willing post his test code.  Initially
> > it needn't be posted as an xfstest, that's fine.  As it stands today it appears
> > that Brian is the only one to ever use the ioctl and there is no way for anyone
> > else to test it.  Not an ideal situation.
> > 
> 
> The impetus for this work was some prototype quota work for the gluster
> distributed filesystem.

Now I'm curious where you're coming from.  Is gluster your primary project?

> Beyond testing via that, I just had a very
> simple/stupid program with the command and data structure bits
> copy/pasted in to open a file and invoke an xfsctl() with hardcoded
> parameters. I don't consider it post-worthy or very useful, but attached
> nonetheless since I have a couple things to deal with before I get a
> chance to play with xfs_io.

Thanks for posting it anyway.  That's enough for an interested party to do a
little fooling.  I believe Mark is finishing up a review of this series, so
we'll get it in soon.

Regards,
	Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-18 15:46                 ` Ben Myers
@ 2012-10-18 16:23                   ` Brian Foster
  0 siblings, 0 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-18 16:23 UTC (permalink / raw)
  To: Ben Myers; +Cc: tinguely, xfs

On 10/18/2012 11:46 AM, Ben Myers wrote:
> Hey Brian,
> 
> On Thu, Oct 18, 2012 at 08:16:57AM -0400, Brian Foster wrote:
>> On 10/17/2012 06:40 PM, Ben Myers wrote:
>>> On Tue, Oct 16, 2012 at 12:39:01PM +1100, Dave Chinner wrote:
>>>> On Tue, Oct 16, 2012 at 10:49:02AM +1100, Dave Chinner wrote:
>>>>> On Mon, Oct 15, 2012 at 05:46:26PM -0500, Ben Myers wrote:
>>>>>> Hey Brian,
>>>>>>
>> ...
>>>>>> I do think it is preferable to have a test case go in with the code where
>>>>>> possible.  Since you don't mind waiting a bit, that seems to be the way to go.
>>>>>> The other option could be to look for a volunteer to work on the test.  ;)
>>>>>
>>>>> FWIW, given the background cleanup code can be trivially verified to
>>>>> work (open, apend, close, repeat, wait 5 minutes) and is the
>>>>> functionality that is needed in mainline, having something to test
>>>>> the ioctls should not stop the patchset from being merged.
>>>
>>> Can we be assured that we'll get an xfstest for it eventually?
>>>
>>
>> Absolutely. Getting a command into xfs_io to support such a test is now
>> the top of my todo list with regard to XFS. :)
> 
> Excellent!
> 
>>> I think we can pull this in if Brian is willing post his test code.  Initially
>>> it needn't be posted as an xfstest, that's fine.  As it stands today it appears
>>> that Brian is the only one to ever use the ioctl and there is no way for anyone
>>> else to test it.  Not an ideal situation.
>>>
>>
>> The impetus for this work was some prototype quota work for the gluster
>> distributed filesystem.
> 
> Now I'm curious where you're coming from.  Is gluster your primary project?
> 

gluster is the primary component of our appliance. We recommend running
our "bricks" on top of XFS, so one of my goals has been to get more
involved with XFS development as well. :) This was a nice little project
for me to break some ground on (actually I have some associated XFS
quota prealloc throttling changes in the works as well, but I'll
prioritize the testing work here first). ;)

>> Beyond testing via that, I just had a very
>> simple/stupid program with the command and data structure bits
>> copy/pasted in to open a file and invoke an xfsctl() with hardcoded
>> parameters. I don't consider it post-worthy or very useful, but attached
>> nonetheless since I have a couple things to deal with before I get a
>> chance to play with xfs_io.
> 
> Thanks for posting it anyway.  That's enough for an interested party to do a
> little fooling.  I believe Mark is finishing up a review of this series, so
> we'll get it in soon.
> 

Sounds good. Thanks.

Brian

> Regards,
> 	Ben
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (9 preceding siblings ...)
  2012-10-05 14:17 ` [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes Brian Foster
@ 2012-10-19 21:02 ` Mark Tinguely
  2012-10-21 14:00   ` Brian Foster
  2012-10-23 19:10 ` Ben Myers
  11 siblings, 1 reply; 43+ messages in thread
From: Mark Tinguely @ 2012-10-19 21:02 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On 10/05/12 09:17, Brian Foster wrote:
> Hi all,
>
> This is v3 of the speculative preallocation inode tracking patchset. This
> functionality tracks inodes with post-EOF speculative preallocation for the
> purpose of background and on-demand trimming.
>
> Background scanning occurs on a longish interval (5 minutes by default) and in
> a best-effort mode (i.e., inodes are skipped due to lock contention or dirty
> cache). The intent is to clear up post-EOF blocks on inodes that might have
> allocations hanging around due to open-write-close sequences (NFS).
>
> On demand scanning is provided via a new ioctl and supports various parameters
> such as scan mode, filtering by quota id and minimum file size. A pending use
> case for on demand scanning is for accurate quota accounting via the gluster
> scale out filesystem (i.e., to free up preallocated space when near a usage
> limit).
>
> Brian

The series looks great.

I am just curious, what is the reason for the padding in the 
xfs_eofblocks structure?

Reviewed-by: Mark Tinguely <tinguely@sgi.com>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-19 21:02 ` [PATCH v5 00/10] speculative preallocation inode tracking Mark Tinguely
@ 2012-10-21 14:00   ` Brian Foster
  2012-10-21 17:53     ` Mark Tinguely
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-21 14:00 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: xfs

On 10/19/2012 05:02 PM, Mark Tinguely wrote:
> On 10/05/12 09:17, Brian Foster wrote:
>> Hi all,
>>
>> This is v3 of the speculative preallocation inode tracking patchset. This
>> functionality tracks inodes with post-EOF speculative preallocation
>> for the
>> purpose of background and on-demand trimming.
>>
>> Background scanning occurs on a longish interval (5 minutes by
>> default) and in
>> a best-effort mode (i.e., inodes are skipped due to lock contention or
>> dirty
>> cache). The intent is to clear up post-EOF blocks on inodes that might
>> have
>> allocations hanging around due to open-write-close sequences (NFS).
>>
>> On demand scanning is provided via a new ioctl and supports various
>> parameters
>> such as scan mode, filtering by quota id and minimum file size. A
>> pending use
>> case for on demand scanning is for accurate quota accounting via the
>> gluster
>> scale out filesystem (i.e., to free up preallocated space when near a
>> usage
>> limit).
>>
>> Brian
> 
> The series looks great.
> 

Hi Mark,

Thanks for the review.

> I am just curious, what is the reason for the padding in the
> xfs_eofblocks structure?
> 

I added the padding in response to review on an early revision of the set:

http://oss.sgi.com/archives/xfs/2012-09/msg00024.html

The purpose is to allow adding fields to the control structure down the
road without breaking existing binaries.

Brian

> Reviewed-by: Mark Tinguely <tinguely@sgi.com>

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-21 14:00   ` Brian Foster
@ 2012-10-21 17:53     ` Mark Tinguely
  2012-10-21 20:31       ` Mark Tinguely
  2012-10-21 22:28       ` Dave Chinner
  0 siblings, 2 replies; 43+ messages in thread
From: Mark Tinguely @ 2012-10-21 17:53 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On 10/21/12 09:00, Brian Foster wrote:
> On 10/19/2012 05:02 PM, Mark Tinguely wrote:
>> On 10/05/12 09:17, Brian Foster wrote:
>>> Hi all,
>>>
>>> This is v3 of the speculative preallocation inode tracking patchset. This
>>> functionality tracks inodes with post-EOF speculative preallocation
>>> for the
>>> purpose of background and on-demand trimming.
>>>
>>> Background scanning occurs on a longish interval (5 minutes by
>>> default) and in
>>> a best-effort mode (i.e., inodes are skipped due to lock contention or
>>> dirty
>>> cache). The intent is to clear up post-EOF blocks on inodes that might
>>> have
>>> allocations hanging around due to open-write-close sequences (NFS).
>>>
>>> On demand scanning is provided via a new ioctl and supports various
>>> parameters
>>> such as scan mode, filtering by quota id and minimum file size. A
>>> pending use
>>> case for on demand scanning is for accurate quota accounting via the
>>> gluster
>>> scale out filesystem (i.e., to free up preallocated space when near a
>>> usage
>>> limit).
>>>
>>> Brian
>>
>> The series looks great.
>>
>
> Hi Mark,
>
> Thanks for the review.
>
>> I am just curious, what is the reason for the padding in the
>> xfs_eofblocks structure?
>>
>
> I added the padding in response to review on an early revision of the set:
>
> http://oss.sgi.com/archives/xfs/2012-09/msg00024.html
>
> The purpose is to allow adding fields to the control structure down the
> road without breaking existing binaries.
>
> Brian
>
>> Reviewed-by: Mark Tinguely<tinguely@sgi.com>
>

Thank-you for the information.

I would think that changing the number of arguments would also
involving changing the version number. The kernel should know
that version 1 copies in 16 bytes, version 2 copies in 16+t bytes,
version n copies in 16+n bytes...

Not at all a big deal, (16 used and 12 padding = ) 28 bytes was an
odd number and it got my curiosity up. I suspected it was part a plan
of yours and Pinky to take over the world. :)

Nice feature. I jury-rigged a ioctl to test and it works great. I look
forwarded to seeing this in xfs_io.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-21 17:53     ` Mark Tinguely
@ 2012-10-21 20:31       ` Mark Tinguely
  2012-10-21 22:28       ` Dave Chinner
  1 sibling, 0 replies; 43+ messages in thread
From: Mark Tinguely @ 2012-10-21 20:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On 10/21/12 12:53, Mark Tinguely wrote:

>
> Not at all a big deal, (16 used and 12 padding = ) 28 bytes was an
> odd number and it got my curiosity up. I suspected it was part a plan
> of yours and Pinky to take over the world. :)
>

Ha-ha. never-mind. I am foiled again by my inability to count past 4, or 
stupidity of counting using diff files.

--Mark.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-21 17:53     ` Mark Tinguely
  2012-10-21 20:31       ` Mark Tinguely
@ 2012-10-21 22:28       ` Dave Chinner
  1 sibling, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-21 22:28 UTC (permalink / raw)
  To: Mark Tinguely; +Cc: Brian Foster, xfs

On Sun, Oct 21, 2012 at 12:53:19PM -0500, Mark Tinguely wrote:
> On 10/21/12 09:00, Brian Foster wrote:
> >On 10/19/2012 05:02 PM, Mark Tinguely wrote:
> >>I am just curious, what is the reason for the padding in the
> >>xfs_eofblocks structure?
> >>
> >
> >I added the padding in response to review on an early revision of the set:
> >
> >http://oss.sgi.com/archives/xfs/2012-09/msg00024.html
> >
> >The purpose is to allow adding fields to the control structure down the
> >road without breaking existing binaries.
> 
> Thank-you for the information.
> 
> I would think that changing the number of arguments would also
> involving changing the version number.

Yes, it usually does.

> The kernel should know
> that version 1 copies in 16 bytes, version 2 copies in 16+t bytes,
> version n copies in 16+n bytes...

Gets messy, pretty quickly.  Versioning and padding user facing
structures makes life a lot easy when it comes to extending
interfaces. 15-20 years of support for an ioctl is a long time, and
that's the sort of time frame we need to think about. How to make it
easy to maintain and extend over a long period of time. We should
always version and pad user facing ioctl structures for this reason.

Further, assuming that userspace knows exactly the right size for a
given feature is problematic.  If there's only one structure type
for userspace to use, then it's likely they'll get it right. If
there's a different structure for every version of the ioctl, then
it's likely they'll get it wrong.

And we get validation wrong in the kernel, too. The kernel must
validate the size of the structure as being correct given the
version number, and that gets hard to validate and easy to get wrong
when you have a different structure for every version that exists.

See, for example, struct xfs_fsop_geom_t, and XFS_IOC_FSGEOMETRY_V1/
XFS_IOC_FSGEOMETRY. The originaly was an unversioned structure with
no padding, and when the V2 log format came along, the structure had
to be extended and a new ioctl added to support it. Internally they
both used the same struture, but the copy-in/out were different and
initialisation was different. The result was leaking unitialised
data to userspace because we didn't get it right.  (c4d0c3b "xfs:
prevent leaking uninitialized stack memory in FSGEOMETRY_V1")

At least the new xfs_fsops_geom structure has a version number in it
so we don't have to add a new ioctl to extend it further. However,
it still doesn't have any padding so any further extensions will
have to be very careful to avoid structure size mismatches. This
could have been avoided is some padding was added as the time the
version number was added....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-18 12:16               ` Brian Foster
  2012-10-18 15:46                 ` Ben Myers
@ 2012-10-22  7:34                 ` Dave Chinner
  2012-10-22 13:23                   ` Brian Foster
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-22  7:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: Ben Myers, xfs

On Thu, Oct 18, 2012 at 08:16:57AM -0400, Brian Foster wrote:
> On 10/17/2012 06:40 PM, Ben Myers wrote:
> >>> FWIW, given the background cleanup code can be trivially verified to
> >>> work (open, apend, close, repeat, wait 5 minutes) and is the
> >>> functionality that is needed in mainline, having something to test
> >>> the ioctls should not stop the patchset from being merged.
> > 
> > Can we be assured that we'll get an xfstest for it eventually?
> 
> Absolutely. Getting a command into xfs_io to support such a test is now
> the top of my todo list with regard to XFS. :)

Here's a patch to the new xfs_spaceman program I'm writing that adds
control for these ioctls.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

spaceman: add new speculative prealloc control

From: Dave Chinner <dchinner@redhat.com>

Add an control interface for purging speculative
preallocation via the new ioctls.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 spaceman/Makefile   |    2 +-
 spaceman/init.c     |    1 +
 spaceman/prealloc.c |  165 +++++++++++++++++++++++++++++++++++++++++++++++++++
 spaceman/space.h    |    1 +
 4 files changed, 168 insertions(+), 1 deletion(-)

diff --git a/spaceman/Makefile b/spaceman/Makefile
index 612d36b..b651904 100644
--- a/spaceman/Makefile
+++ b/spaceman/Makefile
@@ -8,7 +8,7 @@ include $(TOPDIR)/include/builddefs
 LTCOMMAND = xfs_spaceman
 HFILES = init.h space.h
 CFILES = init.c \
-	file.c freesp.c
+	file.c freesp.c prealloc.c
 
 LLDLIBS = $(LIBXCMD)
 LTDEPENDENCIES = $(LIBXCMD)
diff --git a/spaceman/init.c b/spaceman/init.c
index 108dcd7..aa53be4 100644
--- a/spaceman/init.c
+++ b/spaceman/init.c
@@ -40,6 +40,7 @@ init_commands(void)
 	file_init();
 	freesp_init();
 	help_init();
+	prealloc_init();
 	quit_init();
 }
 
diff --git a/spaceman/prealloc.c b/spaceman/prealloc.c
new file mode 100644
index 0000000..8af30a6
--- /dev/null
+++ b/spaceman/prealloc.c
@@ -0,0 +1,165 @@
+/*
+ * Copyright (c) 2000-2001,2005 Silicon Graphics, Inc.
+ * Copyright (c) 2012 Red Hat, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation,
+ * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+ */
+
+#include <xfs/xfs.h>
+#include <xfs/xfs_types.h>
+#include <xfs/command.h>
+#include <linux/dqblk_xfs.h>
+#include "init.h"
+#include "space.h"
+
+#ifndef XFS_IOC_FREE_EOFBLOCKS
+#define XFS_IOC_FREE_EOFBLOCKS _IOR ('X', 58, struct xfs_eofblocks)
+
+#define XFS_EOFBLOCKS_VERSION		1
+struct xfs_eofblocks {
+	__u32		eof_version;
+	__u32		eof_flags;
+	__u32		eof_q_id;
+	__u32		eof_q_type;
+	__u32		eof_min_file_size;
+	unsigned char	pad[12];
+};
+
+/* eof_flags values */
+#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
+#define XFS_EOF_FLAGS_QUOTA		0x02	/* filter by quota id */
+#define XFS_EOF_FLAGS_MINFILESIZE	0x04	/* filter by min file size */
+#endif
+
+int gflag;
+int uflag;
+int pflag;
+int sflag;
+int qid;
+int minlen;
+
+static cmdinfo_t prealloc_cmd;
+
+/*
+ * Report on freespace usage in xfs filesystem.
+ */
+static int
+prealloc_f(
+	int	argc,
+	char	**argv)
+{
+	int	c;
+	struct xfs_eofblocks eofb = {0};
+
+	uflag = 0;
+	gflag = 0;
+	pflag = 0;
+	sflag = 0;
+	minlen = 0;
+	qid = 0;
+
+	while ((c = getopt(argc, argv, "g:m:p:su:")) != EOF) {
+		switch (c) {
+		case 'g':
+			if (uflag || pflag)
+				return command_usage(&prealloc_cmd);
+			gflag = 1;
+			qid = atoi(optarg);
+			break;
+		case 'u':
+			if (gflag || pflag)
+				return command_usage(&prealloc_cmd);
+			uflag = 1;
+			qid = atoi(optarg);
+			break;
+		case 'p':
+			if (uflag || gflag)
+				return command_usage(&prealloc_cmd);
+			pflag = 1;
+			qid = atoi(optarg);
+			break;
+		case 's':
+			sflag = 1;
+			break;
+		case 'm':
+			minlen = atoi(optarg);
+			break;
+		case '?':
+			return command_usage(&prealloc_cmd);
+		}
+	}
+	if (optind != argc)
+		return command_usage(&prealloc_cmd);
+
+	eofb.eof_version = XFS_EOFBLOCKS_VERSION;
+	if (sflag)
+		eofb.eof_flags |= XFS_EOF_FLAGS_SYNC;
+
+	if (minlen) {
+		eofb.eof_flags |= XFS_EOF_FLAGS_MINFILESIZE;
+		eofb.eof_min_file_size = minlen;
+	}
+	if (uflag || gflag || pflag) {
+		eofb.eof_flags |= XFS_EOF_FLAGS_QUOTA;
+		eofb.eof_q_id = qid;
+		if (uflag)
+			eofb.eof_q_type = XQM_USRQUOTA;
+		else if (gflag)
+			eofb.eof_q_type = XQM_GRPQUOTA;
+		else if (pflag)
+			eofb.eof_q_type = XQM_PRJQUOTA;
+	}
+
+	if (xfsctl(file->name, file->fd, XFS_IOC_FREE_EOFBLOCKS, &eofb) < 0) {
+		fprintf(stderr, _("%s: XFS_IOC_FREE_EOFBLOCKS on %s: %s\n"),
+			progname, file->name, strerror(errno));
+	}
+	return 0;
+}
+
+static void
+prealloc_help(void)
+{
+	printf(_(
+"\n"
+"Control speculative preallocation\n"
+"\n"
+"Options: [-s] [-ugp id] [-m minlen]\n"
+"\n"
+" -s -- synchronous flush - wait for flush to complete\n"
+" -u id -- remove prealloc on files matching user quota id <id>\n"
+" -g id -- remove prealloc on files matching group quota id <id>\n"
+" -p id -- remove prealloc on files matching project quota id <id>\n"
+" -m minlen -- only consider files larger than <minlen>\n"
+"\n"));
+
+}
+
+void
+prealloc_init(void)
+{
+	prealloc_cmd.name = "prealloc";
+	prealloc_cmd.altname = "prealloc";
+	prealloc_cmd.cfunc = prealloc_f;
+	prealloc_cmd.argmin = 1;
+	prealloc_cmd.argmax = -1;
+	prealloc_cmd.args = "[-s] [-ugp id] [-m minlen]\n";
+	prealloc_cmd.flags = CMD_FLAG_GLOBAL;
+	prealloc_cmd.oneline = _("Control specualtive preallocation");
+	prealloc_cmd.help = prealloc_help;
+
+	add_command(&prealloc_cmd);
+}
+
diff --git a/spaceman/space.h b/spaceman/space.h
index c6a63fe..33789a3 100644
--- a/spaceman/space.h
+++ b/spaceman/space.h
@@ -35,3 +35,4 @@ extern void	file_init(void);
 extern void	help_init(void);
 extern void	quit_init(void);
 extern void	freesp_init(void);
+extern void	prealloc_init(void);

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-22  7:34                 ` Dave Chinner
@ 2012-10-22 13:23                   ` Brian Foster
  2012-10-22 22:22                     ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-22 13:23 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Ben Myers, xfs

On 10/22/2012 03:34 AM, Dave Chinner wrote:
> On Thu, Oct 18, 2012 at 08:16:57AM -0400, Brian Foster wrote:
>> On 10/17/2012 06:40 PM, Ben Myers wrote:
>>>>> FWIW, given the background cleanup code can be trivially verified to
>>>>> work (open, apend, close, repeat, wait 5 minutes) and is the
>>>>> functionality that is needed in mainline, having something to test
>>>>> the ioctls should not stop the patchset from being merged.
>>>
>>> Can we be assured that we'll get an xfstest for it eventually?
>>
>> Absolutely. Getting a command into xfs_io to support such a test is now
>> the top of my todo list with regard to XFS. :)
> 
> Here's a patch to the new xfs_spaceman program I'm writing that adds
> control for these ioctls.
> 

Very cool, thanks. Catchy name for the tool as well, btw ;).

For some reason my mailer is stripping out the patch, but my only
comment is with regard to minlen. Shouldn't that variable be handled as
an unsigned? Now that I think of it, that makes me wonder if I should
make that a 64-bit unsigned in xfs_eofblocks..?

Brian

> Cheers,
> 
> Dave.
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-22 13:23                   ` Brian Foster
@ 2012-10-22 22:22                     ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-22 22:22 UTC (permalink / raw)
  To: Brian Foster; +Cc: Ben Myers, xfs

On Mon, Oct 22, 2012 at 09:23:11AM -0400, Brian Foster wrote:
> On 10/22/2012 03:34 AM, Dave Chinner wrote:
> > On Thu, Oct 18, 2012 at 08:16:57AM -0400, Brian Foster wrote:
> >> On 10/17/2012 06:40 PM, Ben Myers wrote:
> >>>>> FWIW, given the background cleanup code can be trivially verified to
> >>>>> work (open, apend, close, repeat, wait 5 minutes) and is the
> >>>>> functionality that is needed in mainline, having something to test
> >>>>> the ioctls should not stop the patchset from being merged.
> >>>
> >>> Can we be assured that we'll get an xfstest for it eventually?
> >>
> >> Absolutely. Getting a command into xfs_io to support such a test is now
> >> the top of my todo list with regard to XFS. :)
> > 
> > Here's a patch to the new xfs_spaceman program I'm writing that adds
> > control for these ioctls.
> > 
> 
> Very cool, thanks. Catchy name for the tool as well, btw ;).
> 
> For some reason my mailer is stripping out the patch,

Probably because I simply added it inline below my sig. The mailer
is probably dropping everything below the /^-- $/ line that marks
the sig...
>
> but my only
> comment is with regard to minlen. Shouldn't that variable be handled as
> an unsigned?

Probably, but a ssize_t would be better, and ....

> Now that I think of it, that makes me wonder if I should
> make that a 64-bit unsigned in xfs_eofblocks..?

... yes, a __u64 would be better.

I've got a couple more comments now I've actually used and tested
it, so I'll do that today....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode
  2012-10-05 14:17 ` [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode Brian Foster
@ 2012-10-23  0:58   ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  0:58 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:09AM -0400, Brian Foster wrote:
> This check is used in multiple places to determine whether we
> should check for (and potentially free) post EOF blocks on an
> inode. Add a helper to consolidate the check.
> 
> Note that when we remove an inode from the cache (xfs_inactive()),
> we are required to trim post-EOF blocks even if the inode is marked
> preallocated or append-only to maintain correct space accounting.
> The 'force' parameter to xfs_can_free_eofblocks() specifies whether
> we should ignore the prealloc/append-only status of the inode.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good. I'm glad this little mess is going away :)

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes
  2012-10-05 14:17 ` [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes Brian Foster
@ 2012-10-23  1:01   ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:11AM -0400, Brian Foster wrote:
> xfs_inodes_free_eofblocks() implements scanning functionality for
> EOFBLOCKS inodes. It uses the AG iterator to walk the tagged inodes
> and free post-EOF blocks via the xfs_inode_free_eofblocks() execute
> function. The scan can be invoked in best-effort mode or wait
> (force) mode.
> 
> A best-effort scan (default) handles all inodes that do not have a
> dirty cache and we successfully acquire the io lock via trylock. In
> wait mode, we continue to cycle through an AG until all inodes are
> handled.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks fine.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-05 14:17 ` [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl Brian Foster
  2012-10-11 14:13   ` Ben Myers
@ 2012-10-23  1:31   ` Dave Chinner
  2012-10-24 16:16     ` Brian Foster
  1 sibling, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> The XFS_IOC_FREE_EOFBLOCKS ioctl allows users to invoke an EOFBLOCKS
> scan. The xfs_eofblocks structure is defined to support the command
> parameters (scan mode).
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> ---
>  fs/xfs/xfs_fs.h     |   15 +++++++++++++++
>  fs/xfs/xfs_icache.c |    5 +++--
>  fs/xfs/xfs_icache.h |    2 +-
>  fs/xfs/xfs_ioctl.c  |   16 ++++++++++++++++
>  4 files changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/xfs_fs.h b/fs/xfs/xfs_fs.h
> index c13fed8..5f48b3e 100644
> --- a/fs/xfs/xfs_fs.h
> +++ b/fs/xfs/xfs_fs.h
> @@ -339,6 +339,20 @@ typedef struct xfs_error_injection {
>  
>  
>  /*
> + * Speculative preallocation trimming.
> + */
> +#define XFS_EOFBLOCKS_VERSION		1
> +struct xfs_eofblocks {
> +	__u32		eof_version;
> +	__u32		eof_flags;
> +	unsigned char	pad[12];
> +};

12 bytes of padding is a bit wierd at this point. It's 
problematic for 32bit userspace on 64 bit kernels, in that the size
of the structure can end up different (i.e. 20 bytes on 32b, 24 bytes
on 64b) depending on the architectures natural alignment.

I can also see that adding multiple extra variables to the structure
are quite likely (e.g. per-ag control, start/end inode numbers,
etc), so 12 bytes of padding really isn't sufficient, IMO. I'd tend
to pad out to, say, 128 bytes rather than 32, just in case. i.e:

	__u64		pad[15];

And then take away from this padding space as you add functioanlity
in fucture patches.

This extra padding means the version number won't need to increase
any time soon, as it will be a while before we run out of either
padding or flag space, instead of as soon as we add a new function
to the ioctl....

> +
> +/* eof_flags values */
> +#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */

I kind of prefer flags being defined by (1 << 0) style to keep
larger flag numbers concise, but that's not a big deal.

> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 0e0232c..ad4352f 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -42,6 +42,7 @@
>  #include "xfs_inode_item.h"
>  #include "xfs_export.h"
>  #include "xfs_trace.h"
> +#include "xfs_icache.h"
>  
>  #include <linux/capability.h>
>  #include <linux/dcache.h>
> @@ -1602,6 +1603,21 @@ xfs_file_ioctl(
>  		error = xfs_errortag_clearall(mp, 1);
>  		return -error;
>  
> +	case XFS_IOC_FREE_EOFBLOCKS: {
> +		struct xfs_eofblocks eofb;
> +		int flags;
> +
> +		if (copy_from_user(&eofb, arg, sizeof(eofb)))
> +			return -XFS_ERROR(EFAULT);
> +
> +		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
> +			return -XFS_ERROR(EINVAL);

Checking that no unsupported flags are set here is necessary. Also,
checking the padding is zero is probably a good idea, as it will
force applications to zero the structure properly before being able
to use this interface properly....

> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;

Line if a bit too long. However, would it be better to place this
inside xfs_icache_free_eofblocks()?

> +		error = xfs_icache_free_eofblocks(mp, flags, &eofb);

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 07/10] xfs: make xfs_quota_type() non-static
  2012-10-05 14:17 ` [PATCH v5 07/10] xfs: make xfs_quota_type() non-static Brian Foster
@ 2012-10-23  1:31   ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:31 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:13AM -0400, Brian Foster wrote:
> Make xfs_quota_type() into a non-static function. This is to
> support quota ID filtering in the eofblocks ioctl().
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Reviewed-by: Dave Chinner <dchinner@redhat.com>
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-05 14:17 ` [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan Brian Foster
@ 2012-10-23  1:42   ` Dave Chinner
  2012-10-24 16:18     ` Brian Foster
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:42 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:14AM -0400, Brian Foster wrote:
> Support quota ID filtering in the eofblocks scan. The caller must
> set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
> associated quota type must be enabled on the mounted filesystem.

I'm wondering if this even needs quota enabled to filter based on a
uid, gid, or prid?

The quota part of it seem irrelevant to the filtering that is being
executed - we are only matching against the inode uid/gid/prid, not
against dquots - and I can see situations where just being able to
filter on a given uid/gid might be useful in a multi-user
environment regardless of whether quotas are being used.

Indeed, even testing is made much easier if we don't have to
juggle quota mount options to test the filtering....

> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 35efdda..b39970b 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -1171,6 +1171,22 @@ xfs_reclaim_inodes_count(
>  }
>  
>  STATIC int
> +xfs_inode_match_quota_id(
> +	struct xfs_inode	*ip,
> +	struct xfs_eofblocks	*eofb)
> +{
> +	unsigned int type = xfs_quota_type(eofb->eof_q_type);
> +	if (type == XFS_DQ_USER)
> +		return ip->i_d.di_uid == eofb->eof_q_id;
> +	else if (type == XFS_DQ_GROUP)
> +		return ip->i_d.di_gid == eofb->eof_q_id;
> +	else if (type == XFS_DQ_PROJ)
> +		return xfs_get_projid(ip) == eofb->eof_q_id;
> +
> +	return 0;
> +}

Why do you need xfs_quota_type() here? why not just:

	switch (type) {
	case XQM_USRQUOTA:
		return ip->i_d.di_uid == eofb->eof_q_id;
	case XQM_GRPQUOTA:
		return ip->i_d.di_gid == eofb->eof_q_id;
	case XQM_PRJQUOTA:
		return xfs_get_projid(ip) == eofb->eof_q_id;
	default:
		break;
	}

	return 0;

> @@ -1194,6 +1211,13 @@ xfs_inode_free_eofblocks(
>  	    mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY))
>  		return 0;
>  
> +	if (eofb) {
> +		/* filter by quota id */
> +		if (eofb->eof_flags & XFS_EOF_FLAGS_QUOTA &&
> +		    !xfs_inode_match_quota_id(ip, eofb))
> +			return 0;
> +	}
> +
>  	ret = xfs_free_eofblocks(ip->i_mount, ip, true);
>  
>  	/* don't revisit the inode if we're not waiting */
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index ad4352f..547363b 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1613,6 +1613,14 @@ xfs_file_ioctl(
>  		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
>  			return -XFS_ERROR(EINVAL);
>  
> +		if (eofb.eof_flags & XFS_EOF_FLAGS_QUOTA) {
> +			unsigned int type = xfs_quota_type(eofb.eof_q_type);
> +			if ((type == XFS_DQ_USER && !XFS_IS_UQUOTA_ON(mp)) ||
> +			    (type == XFS_DQ_GROUP && !XFS_IS_GQUOTA_ON(mp)) ||
> +			    (type == XFS_DQ_PROJ && !XFS_IS_PQUOTA_ON(mp)))
> +				return -XFS_ERROR(EINVAL);
> +		}

Again, I don't really see the reason for needing xfs_quota_type()
here, and whether the quota check is relevant at all...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 09/10] xfs: add minimum file size filtering to eofblocks scan
  2012-10-05 14:17 ` [PATCH v5 09/10] xfs: add minimum file size " Brian Foster
@ 2012-10-23  1:43   ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:43 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:15AM -0400, Brian Foster wrote:
> Support minimum file size filtering in the eofblocks scan. The
> caller must set the XFS_EOF_FLAGS_MINFILESIZE flags bit and minimum
> file size value in bytes.
> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>

Looks good.

Reviewed-by: Dave Chinner <dchinner@redhat.com>

-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes
  2012-10-05 14:17 ` [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes Brian Foster
@ 2012-10-23  1:55   ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-23  1:55 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Fri, Oct 05, 2012 at 10:17:16AM -0400, Brian Foster wrote:
> Create a new mount workqueue and delayed_work to enable background
> scanning and freeing of eofblocks inodes. The scanner kicks in once
> speculative preallocation occurs and stops requeueing itself when
> no eofblocks inodes exist.
> 
> The scan interval is based on the new
> 'background_prealloc_discard_period' tunable (default to 5m). The

"discard" - probably a bad idea to use a word that is associated
with similar but different functionality (i.e. mount -o discard).

Something like "speculative_prealloc_lifetime" might be a better way
of describing what it defines. i.e. speculative prealloc won't be
kept longer than this period on clean inodes. That keeps the sysctl
function independent of the underlying implementation, and it also
means we can use it as a catchall for any future speculative
allocation functionality we decide to add....

....
> diff --git a/fs/xfs/xfs_globals.c b/fs/xfs/xfs_globals.c
> index 76e81cf..efea30a 100644
> --- a/fs/xfs/xfs_globals.c
> +++ b/fs/xfs/xfs_globals.c
> @@ -40,4 +40,5 @@ xfs_param_t xfs_params = {
>  	.rotorstep	= {	1,		1,		255	},
>  	.inherit_nodfrg	= {	0,		1,		1	},
>  	.fstrm_timer	= {	1,		30*100,		3600*100},
> +	.eofb_timer	= {	1,		300,		7200},

2 hours as the maximum might be too short for some people. e.g. once
a day might be all that some people will want this to happen because
they have slowly growing log files - they stay in cache, will most
likely trigger the "keep prealloc" heuristic, but can often go 5
minutes without being modified.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 00/10] speculative preallocation inode tracking
  2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
                   ` (10 preceding siblings ...)
  2012-10-19 21:02 ` [PATCH v5 00/10] speculative preallocation inode tracking Mark Tinguely
@ 2012-10-23 19:10 ` Ben Myers
  11 siblings, 0 replies; 43+ messages in thread
From: Ben Myers @ 2012-10-23 19:10 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

Hey Brian,

On Fri, Oct 05, 2012 at 10:17:06AM -0400, Brian Foster wrote:
> This is v3 of the speculative preallocation inode tracking patchset.

I was going to pull this in based upon Mark's review, but it looks like Dave
still has some questions related to patches 5, 8, and 10.  If you can address
those items we can proceed.  I think at this point there is not a need to
repost the entire series... just keep any updates in this thread.

Regards,
	Ben

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-23  1:31   ` Dave Chinner
@ 2012-10-24 16:16     ` Brian Foster
  2012-10-24 19:27       ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-24 16:16 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 10/22/2012 09:31 PM, Dave Chinner wrote:
> On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
...
>>  /*
>> + * Speculative preallocation trimming.
>> + */
>> +#define XFS_EOFBLOCKS_VERSION		1
>> +struct xfs_eofblocks {
>> +	__u32		eof_version;
>> +	__u32		eof_flags;
>> +	unsigned char	pad[12];
>> +};
> 
> 12 bytes of padding is a bit wierd at this point. It's 
> problematic for 32bit userspace on 64 bit kernels, in that the size
> of the structure can end up different (i.e. 20 bytes on 32b, 24 bytes
> on 64b) depending on the architectures natural alignment.
> 
> I can also see that adding multiple extra variables to the structure
> are quite likely (e.g. per-ag control, start/end inode numbers,
> etc), so 12 bytes of padding really isn't sufficient, IMO. I'd tend
> to pad out to, say, 128 bytes rather than 32, just in case. i.e:
> 
> 	__u64		pad[15];
> 
> And then take away from this padding space as you add functioanlity
> in fucture patches.
> 
> This extra padding means the version number won't need to increase
> any time soon, as it will be a while before we run out of either
> padding or flag space, instead of as soon as we add a new function
> to the ioctl....
> 

Ok, I'll convert the file size parameter to a __u64 (as per discussed in
the spaceman eofblocks command thread) and expand the padding.

>> +
>> +/* eof_flags values */
>> +#define XFS_EOF_FLAGS_SYNC		0x01	/* sync/wait mode scan */
> 
> I kind of prefer flags being defined by (1 << 0) style to keep
> larger flag numbers concise, but that's not a big deal.
> 

Eh, I find this cleaner than looking at the raw values as well.

>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index 0e0232c..ad4352f 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -42,6 +42,7 @@
>>  #include "xfs_inode_item.h"
>>  #include "xfs_export.h"
>>  #include "xfs_trace.h"
>> +#include "xfs_icache.h"
>>  
>>  #include <linux/capability.h>
>>  #include <linux/dcache.h>
>> @@ -1602,6 +1603,21 @@ xfs_file_ioctl(
>>  		error = xfs_errortag_clearall(mp, 1);
>>  		return -error;
>>  
>> +	case XFS_IOC_FREE_EOFBLOCKS: {
>> +		struct xfs_eofblocks eofb;
>> +		int flags;
>> +
>> +		if (copy_from_user(&eofb, arg, sizeof(eofb)))
>> +			return -XFS_ERROR(EFAULT);
>> +
>> +		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
>> +			return -XFS_ERROR(EINVAL);
> 
> Checking that no unsupported flags are set here is necessary. Also,
> checking the padding is zero is probably a good idea, as it will
> force applications to zero the structure properly before being able
> to use this interface properly....
> 

Ok.

>> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
> 
> Line if a bit too long. However, would it be better to place this
> inside xfs_icache_free_eofblocks()?
> 

Are you suggesting to eliminate the flags parameter to
xfs_icache_free_eofblocks? The reason for the current interface is that
the background scan caller doesn't require the eofb parameter, so I
decided to generalize the sync/wait parameter in the caller.

If we want to push it into xfs_icache_free_eofblocks(), I think it would
be better at that point to eliminate the flags param and either infer
SYNC_TRYLOCK on a NULL eofb or to require an eofb and pass one with
a cleared XFS_EOF_FLAGS_SYNC from the background scan. Thoughts?

Brian

>> +		error = xfs_icache_free_eofblocks(mp, flags, &eofb);
> 
> Cheers,
> 
> Dave.
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-23  1:42   ` Dave Chinner
@ 2012-10-24 16:18     ` Brian Foster
  2012-10-24 19:41       ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-24 16:18 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 10/22/2012 09:42 PM, Dave Chinner wrote:
> On Fri, Oct 05, 2012 at 10:17:14AM -0400, Brian Foster wrote:
>> Support quota ID filtering in the eofblocks scan. The caller must
>> set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
>> associated quota type must be enabled on the mounted filesystem.
> 
> I'm wondering if this even needs quota enabled to filter based on a
> uid, gid, or prid?
> 

Not really...

> The quota part of it seem irrelevant to the filtering that is being
> executed - we are only matching against the inode uid/gid/prid, not
> against dquots - and I can see situations where just being able to
> filter on a given uid/gid might be useful in a multi-user
> environment regardless of whether quotas are being used.
> 

This is how my rfc version was implemented (with a comment to the same
effect). I added the explicit checking in response to:

http://oss.sgi.com/archives/xfs/2012-09/msg00024.html

> Indeed, even testing is made much easier if we don't have to
> juggle quota mount options to test the filtering....
> 

I agree. I suppose the check can make a bit of sense if the current
group/proj mode of the mount conflicts with the eofb request, but as
you've outlined above, we're checking against the inode values so I
don't think it introduces a functional problem.

Regardless, could you confirm your reasoning here with regard to the
previous comments before I go and make this change? ;)

>> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
>> index 35efdda..b39970b 100644
>> --- a/fs/xfs/xfs_icache.c
>> +++ b/fs/xfs/xfs_icache.c
>> @@ -1171,6 +1171,22 @@ xfs_reclaim_inodes_count(
>>  }
>>  
>>  STATIC int
>> +xfs_inode_match_quota_id(
>> +	struct xfs_inode	*ip,
>> +	struct xfs_eofblocks	*eofb)
>> +{
>> +	unsigned int type = xfs_quota_type(eofb->eof_q_type);
>> +	if (type == XFS_DQ_USER)
>> +		return ip->i_d.di_uid == eofb->eof_q_id;
>> +	else if (type == XFS_DQ_GROUP)
>> +		return ip->i_d.di_gid == eofb->eof_q_id;
>> +	else if (type == XFS_DQ_PROJ)
>> +		return xfs_get_projid(ip) == eofb->eof_q_id;
>> +
>> +	return 0;
>> +}
> 
> Why do you need xfs_quota_type() here? why not just:
> 
> 	switch (type) {
> 	case XQM_USRQUOTA:
> 		return ip->i_d.di_uid == eofb->eof_q_id;
> 	case XQM_GRPQUOTA:
> 		return ip->i_d.di_gid == eofb->eof_q_id;
> 	case XQM_PRJQUOTA:
> 		return xfs_get_projid(ip) == eofb->eof_q_id;
> 	default:
> 		break;
> 	}
> 
> 	return 0;
> 

Technically we don't really need xfs_quota_type(). I just thought it
cleaner to use the same mapping we use throughout XFS (i.e., I don't see
those definitions used elsewhere in XFS)...

>> @@ -1194,6 +1211,13 @@ xfs_inode_free_eofblocks(
>>  	    mapping_tagged(VFS_I(ip)->i_mapping, PAGECACHE_TAG_DIRTY))
>>  		return 0;
>>  
>> +	if (eofb) {
>> +		/* filter by quota id */
>> +		if (eofb->eof_flags & XFS_EOF_FLAGS_QUOTA &&
>> +		    !xfs_inode_match_quota_id(ip, eofb))
>> +			return 0;
>> +	}
>> +
>>  	ret = xfs_free_eofblocks(ip->i_mount, ip, true);
>>  
>>  	/* don't revisit the inode if we're not waiting */
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index ad4352f..547363b 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -1613,6 +1613,14 @@ xfs_file_ioctl(
>>  		if (eofb.eof_version != XFS_EOFBLOCKS_VERSION)
>>  			return -XFS_ERROR(EINVAL);
>>  
>> +		if (eofb.eof_flags & XFS_EOF_FLAGS_QUOTA) {
>> +			unsigned int type = xfs_quota_type(eofb.eof_q_type);
>> +			if ((type == XFS_DQ_USER && !XFS_IS_UQUOTA_ON(mp)) ||
>> +			    (type == XFS_DQ_GROUP && !XFS_IS_GQUOTA_ON(mp)) ||
>> +			    (type == XFS_DQ_PROJ && !XFS_IS_PQUOTA_ON(mp)))
>> +				return -XFS_ERROR(EINVAL);
>> +		}
> 
> Again, I don't really see the reason for needing xfs_quota_type()
> here, and whether the quota check is relevant at all...
> 

Same comment as above... would you prefer I just use the XQM_
definitions and lose the xfs_quota_type() patch?

Brian

> Cheers,
> 
> Dave.
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl
  2012-10-24 16:16     ` Brian Foster
@ 2012-10-24 19:27       ` Dave Chinner
  0 siblings, 0 replies; 43+ messages in thread
From: Dave Chinner @ 2012-10-24 19:27 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Wed, Oct 24, 2012 at 12:16:55PM -0400, Brian Foster wrote:
> On 10/22/2012 09:31 PM, Dave Chinner wrote:
> > On Fri, Oct 05, 2012 at 10:17:12AM -0400, Brian Foster wrote:
> >> +		flags = (eofb.eof_flags & XFS_EOF_FLAGS_SYNC) ? SYNC_WAIT : SYNC_TRYLOCK;
> > 
> > Line if a bit too long. However, would it be better to place this
> > inside xfs_icache_free_eofblocks()?
> 
> Are you suggesting to eliminate the flags parameter to
> xfs_icache_free_eofblocks? The reason for the current interface is that
> the background scan caller doesn't require the eofb parameter, so I
> decided to generalize the sync/wait parameter in the caller.

That's on possibility. I was thinking more of flags = 0, and parsing
the eofb structure for meaning inside xfs_icache_free_eofblocks().
i.e. it overrides flags.

> If we want to push it into xfs_icache_free_eofblocks(), I think it would
> be better at that point to eliminate the flags param and either infer
> SYNC_TRYLOCK on a NULL eofb or to require an eofb and pass one with
> a cleared XFS_EOF_FLAGS_SYNC from the background scan. Thoughts?

Sounds good - a NULL eofb meaning "default background scan" works
for me. If we want anything other than default scan parameters, use
the eofb to be precise....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-24 16:18     ` Brian Foster
@ 2012-10-24 19:41       ` Dave Chinner
  2012-10-24 23:02         ` Brian Foster
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-24 19:41 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Wed, Oct 24, 2012 at 12:18:58PM -0400, Brian Foster wrote:
> On 10/22/2012 09:42 PM, Dave Chinner wrote:
> > On Fri, Oct 05, 2012 at 10:17:14AM -0400, Brian Foster wrote:
> >> Support quota ID filtering in the eofblocks scan. The caller must
> >> set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
> >> associated quota type must be enabled on the mounted filesystem.
> > 
> > I'm wondering if this even needs quota enabled to filter based on a
> > uid, gid, or prid?
> > 
> 
> Not really...
> 
> > The quota part of it seem irrelevant to the filtering that is being
> > executed - we are only matching against the inode uid/gid/prid, not
> > against dquots - and I can see situations where just being able to
> > filter on a given uid/gid might be useful in a multi-user
> > environment regardless of whether quotas are being used.
> > 
> 
> This is how my rfc version was implemented (with a comment to the same
> effect). I added the explicit checking in response to:
> 
> http://oss.sgi.com/archives/xfs/2012-09/msg00024.html

Ok, that was in the context of "we need to be specific about what
the ID is". Now the question is "is quota ID the right ID to use?".
Different question. ;)

> > Indeed, even testing is made much easier if we don't have to
> > juggle quota mount options to test the filtering....
> > 
> 
> I agree. I suppose the check can make a bit of sense if the current
> group/proj mode of the mount conflicts with the eofb request, but as
> you've outlined above, we're checking against the inode values so I
> don't think it introduces a functional problem.

Right, and if we even get the separate project quota inode patches
merged, then all three can be valid at once. Indeed, that makes me
ask the question - should we be able to filter on multiple IDs in a
single pass (e.g. walk Jim's files in the jasper project)?

Seeing as it is just a filter, enabling the above (even though it
would be rarely used) might make sense. i.e. separate field for each
id. It's not like we have to keep the size of the structure to a
minimum...

Cheers,

Dave.
> > Why do you need xfs_quota_type() here? why not just:
> > 
> > 	switch (type) {
> > 	case XQM_USRQUOTA:
> > 		return ip->i_d.di_uid == eofb->eof_q_id;
> > 	case XQM_GRPQUOTA:
> > 		return ip->i_d.di_gid == eofb->eof_q_id;
> > 	case XQM_PRJQUOTA:
> > 		return xfs_get_projid(ip) == eofb->eof_q_id;
> > 	default:
> > 		break;
> > 	}
> > 
> > 	return 0;
> > 
> 
> Technically we don't really need xfs_quota_type(). I just thought it
> cleaner to use the same mapping we use throughout XFS (i.e., I don't see
> those definitions used elsewhere in XFS)...

XQM_* are the userspace XFS quota interface definitions. They are
what xfs_quota uses. They just so happen to be the same as the
generic quota definitions except the generic quotas don't define
project quota types (hence the need for mapping).  See
include/uapi/linux/dqblk_xfs.h.

I just figured that rather than mapping something unknown to
something known, just using something known in the first place is
much simpler...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-24 19:41       ` Dave Chinner
@ 2012-10-24 23:02         ` Brian Foster
  2012-10-25  0:02           ` Dave Chinner
  0 siblings, 1 reply; 43+ messages in thread
From: Brian Foster @ 2012-10-24 23:02 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 10/24/2012 03:41 PM, Dave Chinner wrote:
> On Wed, Oct 24, 2012 at 12:18:58PM -0400, Brian Foster wrote:
>> On 10/22/2012 09:42 PM, Dave Chinner wrote:
>>> On Fri, Oct 05, 2012 at 10:17:14AM -0400, Brian Foster wrote:
>>>> Support quota ID filtering in the eofblocks scan. The caller must
>>>> set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
>>>> associated quota type must be enabled on the mounted filesystem.
>>>
>>> I'm wondering if this even needs quota enabled to filter based on a
>>> uid, gid, or prid?
>>>
>>
>> Not really...
>>
>>> The quota part of it seem irrelevant to the filtering that is being
>>> executed - we are only matching against the inode uid/gid/prid, not
>>> against dquots - and I can see situations where just being able to
>>> filter on a given uid/gid might be useful in a multi-user
>>> environment regardless of whether quotas are being used.
>>>
>>
>> This is how my rfc version was implemented (with a comment to the same
>> effect). I added the explicit checking in response to:
>>
>> http://oss.sgi.com/archives/xfs/2012-09/msg00024.html
> 
> Ok, that was in the context of "we need to be specific about what
> the ID is". Now the question is "is quota ID the right ID to use?".
> Different question. ;)
> 

Fair enough. :) I just want to make sure we're on the same page. I'll
take this as an Ok to remove the quota enabled checks. The resulting
behavior will now be that quota ID filtering is supported regardless of
whether quotas are enabled on the targeted mount.

>>> Indeed, even testing is made much easier if we don't have to
>>> juggle quota mount options to test the filtering....
>>>
>>
>> I agree. I suppose the check can make a bit of sense if the current
>> group/proj mode of the mount conflicts with the eofb request, but as
>> you've outlined above, we're checking against the inode values so I
>> don't think it introduces a functional problem.
> 
> Right, and if we even get the separate project quota inode patches
> merged, then all three can be valid at once. Indeed, that makes me
> ask the question - should we be able to filter on multiple IDs in a
> single pass (e.g. walk Jim's files in the jasper project)?
> 

That sounds vaguely familiar, perhaps I've skimmed over such a patchset,
but it's not immediately clear to me. Interesting idea nonetheless and I
see no idea why we couldn't support such a mechanism if the id's are
independent...

> Seeing as it is just a filter, enabling the above (even though it
> would be rarely used) might make sense. i.e. separate field for each
> id. It's not like we have to keep the size of the structure to a
> minimum...
> 

... but that is also starting to further complicate the filtering and
testing required. I'm also not quite sure how we would currently specify
a union of id's in the flags given that we migrated towards the use of
the real quota type field (eof_q_type) rather than my original approach
of specifying the quota id type as a flag.

Would we be sufficiently prepared for the future if we broke the id into
three fields, i.e.:

struct xfs_eofblocks {
	...
	__u32	eof_uquota_id;
	__u32	eof_gquota_id;
	__u32	eof_pquota_id;
	__u32	eof_q_type;
	...
};

... but the in kernel implementation remains as is by using the
associated field based on the quota type? To be honest, even if we had a
way to specify multiple id's, I think I would prefer to leave the
implementation as is and do the enhanced filtering in future patches
(e.g., get the data structure right, but EINVAL for now if multiple id's
are flagged), if for nothing else than the testing I've already done
against this implementation. But I suspect if there isn't currently a
standard way to specify multiple quota types, we're fated to have to
update the structure with a new version anyways..?

Perhaps I need to take a look at this project quota inode patchset you
refer to...

> Cheers,
> 
> Dave.

Heh... almost stopped reading here.. ;)

>>> Why do you need xfs_quota_type() here? why not just:
>>>
>>> 	switch (type) {
>>> 	case XQM_USRQUOTA:
>>> 		return ip->i_d.di_uid == eofb->eof_q_id;
>>> 	case XQM_GRPQUOTA:
>>> 		return ip->i_d.di_gid == eofb->eof_q_id;
>>> 	case XQM_PRJQUOTA:
>>> 		return xfs_get_projid(ip) == eofb->eof_q_id;
>>> 	default:
>>> 		break;
>>> 	}
>>>
>>> 	return 0;
>>>
>>
>> Technically we don't really need xfs_quota_type(). I just thought it
>> cleaner to use the same mapping we use throughout XFS (i.e., I don't see
>> those definitions used elsewhere in XFS)...
> 
> XQM_* are the userspace XFS quota interface definitions. They are
> what xfs_quota uses. They just so happen to be the same as the
> generic quota definitions except the generic quotas don't define
> project quota types (hence the need for mapping).  See
> include/uapi/linux/dqblk_xfs.h.
> 
> I just figured that rather than mapping something unknown to
> something known, just using something known in the first place is
> much simpler...
> 

Right, at the time I think I dug through the various quota type
definitions, saw that we provided this mapping function and got the
impression we should be consistent throughout the filesystem. But I
think your point here is that the expected input should be the
definitions we already define for userspace whereas I was thinking the
generic definitions were ubiquitous from userspace.

Now that I have a quick look, I don't see the XQM_* definitions used
anywhere in xfsprogs or the kernel. I see XFS_{USER,GROUP,PROJ}_QUOTA
defined and mapped to USRQUOTA/GRPQUOTA/PRJQUOTA in xfsprogs, but then
the former not defined in the kernel. :/

Brian

> Cheers,
> 
> Dave.
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-24 23:02         ` Brian Foster
@ 2012-10-25  0:02           ` Dave Chinner
  2012-10-25  0:29             ` Brian Foster
  0 siblings, 1 reply; 43+ messages in thread
From: Dave Chinner @ 2012-10-25  0:02 UTC (permalink / raw)
  To: Brian Foster; +Cc: xfs

On Wed, Oct 24, 2012 at 07:02:33PM -0400, Brian Foster wrote:
> On 10/24/2012 03:41 PM, Dave Chinner wrote:
> > On Wed, Oct 24, 2012 at 12:18:58PM -0400, Brian Foster wrote:
> >> On 10/22/2012 09:42 PM, Dave Chinner wrote:
> >>> On Fri, Oct 05, 2012 at 10:17:14AM -0400, Brian Foster wrote:
> >>>> Support quota ID filtering in the eofblocks scan. The caller must
> >>>> set the XFS_EOF_FLAGS_QUOTA flags bit, quota ID and quota type. The
> >>>> associated quota type must be enabled on the mounted filesystem.
> >>>
> >>> I'm wondering if this even needs quota enabled to filter based on a
> >>> uid, gid, or prid?
> >>>
> >>
> >> Not really...
> >>
> >>> The quota part of it seem irrelevant to the filtering that is being
> >>> executed - we are only matching against the inode uid/gid/prid, not
> >>> against dquots - and I can see situations where just being able to
> >>> filter on a given uid/gid might be useful in a multi-user
> >>> environment regardless of whether quotas are being used.
> >>>
> >>
> >> This is how my rfc version was implemented (with a comment to the same
> >> effect). I added the explicit checking in response to:
> >>
> >> http://oss.sgi.com/archives/xfs/2012-09/msg00024.html
> > 
> > Ok, that was in the context of "we need to be specific about what
> > the ID is". Now the question is "is quota ID the right ID to use?".
> > Different question. ;)
> > 
> 
> Fair enough. :) I just want to make sure we're on the same page. I'll
> take this as an Ok to remove the quota enabled checks. The resulting
> behavior will now be that quota ID filtering is supported regardless of
> whether quotas are enabled on the targeted mount.

Right, and that point it isn't a quota ID - it's a uid, gid or prid
we are filtering on....

> > Seeing as it is just a filter, enabling the above (even though it
> > would be rarely used) might make sense. i.e. separate field for each
> > id. It's not like we have to keep the size of the structure to a
> > minimum...
> > 
> 
> ... but that is also starting to further complicate the filtering and
> testing required. I'm also not quite sure how we would currently specify
> a union of id's in the flags given that we migrated towards the use of
> the real quota type field (eof_q_type) rather than my original approach
> of specifying the quota id type as a flag.

Well, quota type just goes away completely - it's irrelevant.

> Would we be sufficiently prepared for the future if we broke the id into
> three fields, i.e.:
> 
> struct xfs_eofblocks {
> 	...
> 	__u32	eof_uquota_id;
> 	__u32	eof_gquota_id;
> 	__u32	eof_pquota_id;
> 	__u32	eof_q_type;
> 	...

I'd suggest uid, gid and prid as the types, dropping the quota type
altogether. Then add specific flags to indicate each field is set
rather than a flag to say "look at the q_type field for which id
field to read"...

i.e. the API is indepedent of quota configurations and quota IDs,
but is still capable of expressing such control when required.

> ... but the in kernel implementation remains as is by using the
> associated field based on the quota type? To be honest, even if we had a
> way to specify multiple id's, I think I would prefer to leave the
> implementation as is and do the enhanced filtering in future patches
> (e.g., get the data structure right, but EINVAL for now if multiple id's
> are flagged),

Sure, that's fine.

> if for nothing else than the testing I've already done
> against this implementation. But I suspect if there isn't currently a
> standard way to specify multiple quota types, we're fated to have to
> update the structure with a new version anyways..?

There is a standard way of recording a *single* "quota ID" in kernel
space now - see quota.h:

struct kqid {                   /* Type in which we store the quota identifier */
        union {
                kuid_t uid;
                kgid_t gid;
                kprojid_t projid;
        };
        enum quota_type type;  /* USRQUOTA (uid) or GRPQUOTA (gid) or PRJQUOTA (projid) */
}

and a bunch or wrappers for manipulating them. That's the sort of
thing that we'd need to build internally for filtering so we end up
with user namespace support. This is shiny and new, just merged in
3.7-rc1....

> > XQM_* are the userspace XFS quota interface definitions. They are
> > what xfs_quota uses. They just so happen to be the same as the
> > generic quota definitions except the generic quotas don't define
> > project quota types (hence the need for mapping).  See
> > include/uapi/linux/dqblk_xfs.h.
> > 
> > I just figured that rather than mapping something unknown to
> > something known, just using something known in the first place is
> > much simpler...
> > 
> 
> Right, at the time I think I dug through the various quota type
> definitions, saw that we provided this mapping function and got the
> impression we should be consistent throughout the filesystem. But I
> think your point here is that the expected input should be the
> definitions we already define for userspace whereas I was thinking the
> generic definitions were ubiquitous from userspace.

Well, the generic definitions never defined project quota. If you
look now in the 3.7 codebase you'll see that PRJQUOTA has been added
to include/linux/quota.h, as has a kprojid_t type.  Hence project
quota IDs are now considered a first class quota citizen by the
generic quota *kernel* code.  However, include/uapi/linux/quota.h is
completely unchanged, so the generic quota userspace interface still
knows nothing about project IDs, which means the only actual user
facing definition is still in include/uapi/linux/dqblks_xfs.h -
XQM_PRJQUOTA. Hence we assume a quota ID that doesn't match anything
the generic quota understands to be a project quota as that's the
only way stuff works in a predictable, reliable manner...

The origin of this mess are historic as XFS brought along it's own
quota implementation and API from Irix years ago because the linux
quota subsystem was not robust or fully featured enough to even
consider integration at the time (e.g. generic quotas didn't even
support journalled quotas). So, yeah, a mess but one that is slowly
getting cleaned up.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan
  2012-10-25  0:02           ` Dave Chinner
@ 2012-10-25  0:29             ` Brian Foster
  0 siblings, 0 replies; 43+ messages in thread
From: Brian Foster @ 2012-10-25  0:29 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On 10/24/2012 08:02 PM, Dave Chinner wrote:
> On Wed, Oct 24, 2012 at 07:02:33PM -0400, Brian Foster wrote:
>> On 10/24/2012 03:41 PM, Dave Chinner wrote:
>>> On Wed, Oct 24, 2012 at 12:18:58PM -0400, Brian Foster wrote:
>>>> On 10/22/2012 09:42 PM, Dave Chinner wrote:
...
>>> Seeing as it is just a filter, enabling the above (even though it
>>> would be rarely used) might make sense. i.e. separate field for each
>>> id. It's not like we have to keep the size of the structure to a
>>> minimum...
>>>
>>
>> ... but that is also starting to further complicate the filtering and
>> testing required. I'm also not quite sure how we would currently specify
>> a union of id's in the flags given that we migrated towards the use of
>> the real quota type field (eof_q_type) rather than my original approach
>> of specifying the quota id type as a flag.
> 
> Well, quota type just goes away completely - it's irrelevant.
> 
>> Would we be sufficiently prepared for the future if we broke the id into
>> three fields, i.e.:
>>
>> struct xfs_eofblocks {
>> 	...
>> 	__u32	eof_uquota_id;
>> 	__u32	eof_gquota_id;
>> 	__u32	eof_pquota_id;
>> 	__u32	eof_q_type;
>> 	...
> 
> I'd suggest uid, gid and prid as the types, dropping the quota type
> altogether. Then add specific flags to indicate each field is set
> rather than a flag to say "look at the q_type field for which id
> field to read"...
> 

Ah, I see your point now. The filter parameters basically become
inode-centric rather than quota-centric. This makes sense.

> i.e. the API is indepedent of quota configurations and quota IDs,
> but is still capable of expressing such control when required.
> 
>> ... but the in kernel implementation remains as is by using the
>> associated field based on the quota type? To be honest, even if we had a
>> way to specify multiple id's, I think I would prefer to leave the
>> implementation as is and do the enhanced filtering in future patches
>> (e.g., get the data structure right, but EINVAL for now if multiple id's
>> are flagged),
> 
> Sure, that's fine.
> 
>> if for nothing else than the testing I've already done
>> against this implementation. But I suspect if there isn't currently a
>> standard way to specify multiple quota types, we're fated to have to
>> update the structure with a new version anyways..?
> 
> There is a standard way of recording a *single* "quota ID" in kernel
> space now - see quota.h:
> 
> struct kqid {                   /* Type in which we store the quota identifier */
>         union {
>                 kuid_t uid;
>                 kgid_t gid;
>                 kprojid_t projid;
>         };
>         enum quota_type type;  /* USRQUOTA (uid) or GRPQUOTA (gid) or PRJQUOTA (projid) */
> }
> 
> and a bunch or wrappers for manipulating them. That's the sort of
> thing that we'd need to build internally for filtering so we end up
> with user namespace support. This is shiny and new, just merged in
> 3.7-rc1....
> 

Ok, I do remember taking a cursory look at this set. I didn't realize it
was merged. Thanks for the explanation.

Brian

>>> XQM_* are the userspace XFS quota interface definitions. They are
>>> what xfs_quota uses. They just so happen to be the same as the
>>> generic quota definitions except the generic quotas don't define
>>> project quota types (hence the need for mapping).  See
>>> include/uapi/linux/dqblk_xfs.h.
>>>
>>> I just figured that rather than mapping something unknown to
>>> something known, just using something known in the first place is
>>> much simpler...
>>>
>>
>> Right, at the time I think I dug through the various quota type
>> definitions, saw that we provided this mapping function and got the
>> impression we should be consistent throughout the filesystem. But I
>> think your point here is that the expected input should be the
>> definitions we already define for userspace whereas I was thinking the
>> generic definitions were ubiquitous from userspace.
> 
> Well, the generic definitions never defined project quota. If you
> look now in the 3.7 codebase you'll see that PRJQUOTA has been added
> to include/linux/quota.h, as has a kprojid_t type.  Hence project
> quota IDs are now considered a first class quota citizen by the
> generic quota *kernel* code.  However, include/uapi/linux/quota.h is
> completely unchanged, so the generic quota userspace interface still
> knows nothing about project IDs, which means the only actual user
> facing definition is still in include/uapi/linux/dqblks_xfs.h -
> XQM_PRJQUOTA. Hence we assume a quota ID that doesn't match anything
> the generic quota understands to be a project quota as that's the
> only way stuff works in a predictable, reliable manner...
> 
> The origin of this mess are historic as XFS brought along it's own
> quota implementation and API from Irix years ago because the linux
> quota subsystem was not robust or fully featured enough to even
> consider integration at the time (e.g. generic quotas didn't even
> support journalled quotas). So, yeah, a mess but one that is slowly
> getting cleaned up.
> 
> Cheers,
> 
> Dave.
> 

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2012-10-25  0:29 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-10-05 14:17 [PATCH v5 00/10] speculative preallocation inode tracking Brian Foster
2012-10-05 14:17 ` [PATCH v5 01/10] xfs: add EOFBLOCKS inode tagging/untagging Brian Foster
2012-10-05 14:17 ` [PATCH v5 02/10] xfs: support a tag-based inode_ag_iterator Brian Foster
2012-10-05 14:17 ` [PATCH v5 03/10] xfs: create helper to check whether to free eofblocks on inode Brian Foster
2012-10-23  0:58   ` Dave Chinner
2012-10-05 14:17 ` [PATCH v5 04/10] xfs: make xfs_free_eofblocks() non-static, return EAGAIN on trylock failure Brian Foster
2012-10-05 14:17 ` [PATCH v5 05/10] xfs: create function to scan and clear EOFBLOCKS inodes Brian Foster
2012-10-23  1:01   ` Dave Chinner
2012-10-05 14:17 ` [PATCH v5 06/10] xfs: add XFS_IOC_FREE_EOFBLOCKS ioctl Brian Foster
2012-10-11 14:13   ` Ben Myers
2012-10-11 22:35     ` Brian Foster
2012-10-15 22:46       ` Ben Myers
2012-10-15 23:49         ` Dave Chinner
2012-10-16  1:39           ` Dave Chinner
2012-10-17 22:40             ` Ben Myers
2012-10-18 12:16               ` Brian Foster
2012-10-18 15:46                 ` Ben Myers
2012-10-18 16:23                   ` Brian Foster
2012-10-22  7:34                 ` Dave Chinner
2012-10-22 13:23                   ` Brian Foster
2012-10-22 22:22                     ` Dave Chinner
2012-10-23  1:31   ` Dave Chinner
2012-10-24 16:16     ` Brian Foster
2012-10-24 19:27       ` Dave Chinner
2012-10-05 14:17 ` [PATCH v5 07/10] xfs: make xfs_quota_type() non-static Brian Foster
2012-10-23  1:31   ` Dave Chinner
2012-10-05 14:17 ` [PATCH v5 08/10] xfs: add quota id filtering to eofblocks scan Brian Foster
2012-10-23  1:42   ` Dave Chinner
2012-10-24 16:18     ` Brian Foster
2012-10-24 19:41       ` Dave Chinner
2012-10-24 23:02         ` Brian Foster
2012-10-25  0:02           ` Dave Chinner
2012-10-25  0:29             ` Brian Foster
2012-10-05 14:17 ` [PATCH v5 09/10] xfs: add minimum file size " Brian Foster
2012-10-23  1:43   ` Dave Chinner
2012-10-05 14:17 ` [PATCH v5 10/10] xfs: add background scanning to clear eofblocks inodes Brian Foster
2012-10-23  1:55   ` Dave Chinner
2012-10-19 21:02 ` [PATCH v5 00/10] speculative preallocation inode tracking Mark Tinguely
2012-10-21 14:00   ` Brian Foster
2012-10-21 17:53     ` Mark Tinguely
2012-10-21 20:31       ` Mark Tinguely
2012-10-21 22:28       ` Dave Chinner
2012-10-23 19:10 ` Ben Myers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.