linux-btrfs.vger.kernel.org archive mirror
* [patch 0/2] Control filesystem balances (kernel side)
@ 2010-10-30  0:07 Hugo Mills
  2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Hugo Mills @ 2010-10-30  0:07 UTC (permalink / raw)
  To: linux-btrfs

   These two patches give a degree of control over balance operations.
The first makes it possible to get an idea of how much work remains to
do, by tracking the number of block groups (chunks) that need to be
moved/rewritten. The second patch allows a running balance operation
to be cancelled when the current block group has been moved.

   One fundamental question, though -- is the progress monitor
function best implemented as an ioctl, as I've done here, or should it
be two or three sysfs files? I'm thinking of /proc/mdstat...
Obviously, /proc/mdstat would never get into /sys, but exposing the
"expected" and "remaining" values as files has an attractive
simplicity to it.
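
As a rough illustration of the ioctl route, a user-space caller might look something like the sketch below. It is not part of the patch series: the struct layout and the ioctl number 21 are copied from patch 1/2 and only mean anything on a kernel with that patch applied, and the names balance_progress, balance_percent and query_balance are hypothetical, chosen for this example.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Mirrors struct btrfs_ioctl_balance_progress as proposed in patch 1/2. */
struct balance_progress {
	uint64_t expected;	/* block groups the balance expects to move */
	uint64_t completed;	/* block groups moved so far */
};

/*
 * 0x94 is BTRFS_IOCTL_MAGIC. The command number 21 is taken from the
 * patch and is an assumption about a patched kernel, not a stable ABI.
 */
#define SKETCH_BTRFS_IOCTL_MAGIC 0x94
#define SKETCH_IOC_BALANCE_PROGRESS \
	_IOR(SKETCH_BTRFS_IOCTL_MAGIC, 21, struct balance_progress)

/* Whole-number percentage of work done, or -1 if nothing is expected. */
int balance_percent(const struct balance_progress *p)
{
	if (p->expected == 0)
		return -1;
	if (p->completed >= p->expected)
		return 100;
	return (int)(p->completed * 100 / p->expected);
}

/*
 * Ask the filesystem behind fd for its balance progress.
 * Returns 0 on success, or -1 with errno set. With the patch applied,
 * errno is EINVAL when no balance is running.
 */
int query_balance(int fd, struct balance_progress *out)
{
	return ioctl(fd, SKETCH_IOC_BALANCE_PROGRESS, out);
}
```

A monitoring tool would open any file or directory on the mounted filesystem, call query_balance() on that descriptor in a loop, and print balance_percent() until the ioctl starts failing with EINVAL.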

   The user-space side of things is in a separate patch series, to
follow.

   Please be gentle with me, this is my first (serious, non-trivial)
kernel patch. :)

   Hugo.


-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- "No!  My collection of rare, incurable diseases! Violated!" ---   



* [patch 1/2] Balance progress monitoring.
  2010-10-30  0:07 [patch 0/2] Control filesystem balances (kernel side) Hugo Mills
@ 2010-10-30  0:07 ` Hugo Mills
  2010-10-30 13:37   ` Hugo Mills
  2010-10-30 13:39   ` [patch 1/2] Balance progress monitoring (updated) Hugo Mills
  2010-10-30  0:07 ` [patch 2/2] Cancel filesystem balance Hugo Mills
  2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
  2 siblings, 2 replies; 14+ messages in thread
From: Hugo Mills @ 2010-10-30  0:07 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Hugo Mills

This patch introduces a basic form of progress monitoring for balance
operations, by counting the number of block groups remaining. The
information is exposed to userspace by an ioctl.

Signed-off-by: Hugo Mills <hugo@carfax.org.uk>

---
 fs/btrfs/ctree.h   |    9 ++++++++
 fs/btrfs/disk-io.c |    2 +
 fs/btrfs/ioctl.c   |   34 ++++++++++++++++++++++++++++++++
 fs/btrfs/ioctl.h   |    7 ++++++
 fs/btrfs/volumes.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 105 insertions(+), 2 deletions(-)

Index: linux-mainline/fs/btrfs/ctree.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ctree.h	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ctree.h	2010-10-29 17:20:43.860460761 +0100
@@ -803,6 +803,11 @@
 	struct list_head cluster_list;
 };
 
+struct btrfs_balance_info {
+	u64 expected;
+	u64 completed;
+};
+
 struct reloc_control;
 struct btrfs_device;
 struct btrfs_fs_devices;
@@ -1010,6 +1015,10 @@
 	unsigned metadata_ratio;
 
 	void *bdev_holder;
+
+	/* Keep track of any rebalance operations on this FS */
+	spinlock_t balance_info_lock;
+	struct btrfs_balance_info *balance_info;
 };
 
 /*
Index: linux-mainline/fs/btrfs/ioctl.c
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.c	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ioctl.c	2010-10-29 17:21:26.128742389 +0100
@@ -1984,6 +1984,38 @@
 	return 0;
 }
 
+/*
+ * Return the current status of any balance operation
+ */
+long btrfs_ioctl_balance_progress(
+	struct btrfs_fs_info *fs_info,
+	struct btrfs_ioctl_balance_progress __user *user_dest)
+{
+	int ret = 0;
+	struct btrfs_ioctl_balance_progress dest;
+
+	spin_lock(&fs_info->balance_info_lock);
+	if (!fs_info->balance_info) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	dest.expected = fs_info->balance_info->expected;
+	dest.completed = fs_info->balance_info->completed;
+
+	spin_unlock(&fs_info->balance_info_lock);
+
+	if (copy_to_user(user_dest, &dest,
+			 sizeof(struct btrfs_ioctl_balance_progress)))
+		return -EFAULT;
+
+	return 0;
+
+error:
+	spin_unlock(&fs_info->balance_info_lock);
+	return ret;
+}
+
 long btrfs_ioctl(struct file *file, unsigned int
 		cmd, unsigned long arg)
 {
@@ -2017,6 +2049,8 @@
 		return btrfs_ioctl_rm_dev(root, argp);
 	case BTRFS_IOC_BALANCE:
 		return btrfs_balance(root->fs_info->dev_root);
+	case BTRFS_IOC_BALANCE_PROGRESS:
+		return btrfs_ioctl_balance_progress(root->fs_info, argp);
 	case BTRFS_IOC_CLONE:
 		return btrfs_ioctl_clone(file, arg, 0, 0, 0);
 	case BTRFS_IOC_CLONE_RANGE:
Index: linux-mainline/fs/btrfs/ioctl.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.h	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ioctl.h	2010-10-29 17:05:44.447028825 +0100
@@ -138,6 +138,11 @@
 	struct btrfs_ioctl_space_info spaces[0];
 };
 
+struct btrfs_ioctl_balance_progress {
+	__u64 expected;
+	__u64 completed;
+};
+
 #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
 				   struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \
@@ -178,4 +183,6 @@
 #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64)
 #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \
 				    struct btrfs_ioctl_space_args)
+#define BTRFS_IOC_BALANCE_PROGRESS _IOR(BTRFS_IOCTL_MAGIC, 21, \
+					struct btrfs_ioctl_balance_progress)
 #endif
Index: linux-mainline/fs/btrfs/volumes.c
===================================================================
--- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/volumes.c	2010-10-29 17:23:40.463279287 +0100
@@ -1902,6 +1902,7 @@
 	struct btrfs_root *chunk_root = dev_root->fs_info->chunk_root;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_key found_key;
+	struct btrfs_balance_status *bal_info;
 
 	if (dev_root->fs_info->sb->s_flags & MS_RDONLY)
 		return -EROFS;
@@ -1909,6 +1910,18 @@
 	mutex_lock(&dev_root->fs_info->volume_mutex);
 	dev_root = dev_root->fs_info->dev_root;
 
+	dev_root->fs_info->balance_info = kmalloc(
+		sizeof(struct btrfs_balance_info),
+		GFP_NOFS);
+	if (!dev_root->fs_info->balance_info) {
+		ret = -ENOSPC;
+		goto error_no_status;
+	}
+	bal_info = dev_root->fs_info->balance_info;
+	bal_info->expected = -1; /* One less than actually counted,
+				    because chunk 0 is special */
+	bal_info->completed = 0;
+
 	/* step one make some room on all the devices */
 	list_for_each_entry(device, devices, dev_list) {
 		old_size = device->total_bytes;
@@ -1932,10 +1945,40 @@
 		btrfs_end_transaction(trans, dev_root);
 	}
 
-	/* step two, relocate all the chunks */
+	/* step two, count the chunks */
 	path = btrfs_alloc_path();
-	BUG_ON(!path);
+	if (!path) {
+		ret = -ENOSPC;
+		goto error;
+	}
+
+	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
+	key.offset = (u64)-1;
+	key.type = BTRFS_CHUNK_ITEM_KEY;
+
+	ret = btrfs_search_slot(NULL, chunk_root, &key, path, 0, 0);
+	if (ret <= 0) {
+		printk(KERN_ERR "btrfs: Failed to find the last chunk.\n");
+		BUG();
+	}
+
+	while (1) {
+		ret = btrfs_previous_item(chunk_root, path, 0,
+					  BTRFS_CHUNK_ITEM_KEY);
+		if (ret)
+			break;
+
+		bal_info->expected++;
+	}
+
+	btrfs_free_path(path);
+	path = btrfs_alloc_path();
+	if (!path) {
+		ret = -ENOSPC;
+		goto error;
+	}
 
+	/* step three, relocate all the chunks */
 	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
 	key.offset = (u64)-1;
 	key.type = BTRFS_CHUNK_ITEM_KEY;
@@ -1976,10 +2019,18 @@
 					   found_key.offset);
 		BUG_ON(ret && ret != -ENOSPC);
 		key.offset = found_key.offset - 1;
+		bal_info->completed++;
+		printk(KERN_INFO "btrfs: balance: %llu/%llu block groups completed\n",
+		       bal_info->completed, bal_info->expected);
 	}
 	ret = 0;
 error:
 	btrfs_free_path(path);
+	spin_lock(&dev_root->fs_info->balance_info_lock);
+	kfree(dev_root->fs_info->balance_info);
+	dev_root->fs_info->balance_info = NULL;
+	spin_unlock(&dev_root->fs_info->balance_info_lock);
+error_no_status:
 	mutex_unlock(&dev_root->fs_info->volume_mutex);
 	return ret;
 }
Index: linux-mainline/fs/btrfs/disk-io.c
===================================================================
--- linux-mainline.orig/fs/btrfs/disk-io.c	2010-10-29 17:19:12.404178865 +0100
+++ linux-mainline/fs/btrfs/disk-io.c	2010-10-29 17:20:02.022161666 +0100
@@ -1591,6 +1591,7 @@
 	spin_lock_init(&fs_info->ref_cache_lock);
 	spin_lock_init(&fs_info->fs_roots_radix_lock);
 	spin_lock_init(&fs_info->delayed_iput_lock);
+	spin_lock_init(&fs_info->balance_info_lock);
 
 	init_completion(&fs_info->kobj_unregister);
 	fs_info->tree_root = tree_root;
@@ -1616,6 +1617,7 @@
 	fs_info->sb = sb;
 	fs_info->max_inline = 8192 * 1024;
 	fs_info->metadata_ratio = 0;
+	fs_info->balance_info = NULL;
 
 	fs_info->thread_pool_size = min_t(unsigned long,
 					  num_online_cpus() + 2, 8);




* [patch 2/2] Cancel filesystem balance.
  2010-10-30  0:07 [patch 0/2] Control filesystem balances (kernel side) Hugo Mills
  2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
@ 2010-10-30  0:07 ` Hugo Mills
  2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
  2 siblings, 0 replies; 14+ messages in thread
From: Hugo Mills @ 2010-10-30  0:07 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Hugo Mills

This patch adds an ioctl for cancelling a btrfs balance operation
mid-flight. The ioctl simply sets a flag, and the operation terminates
after the current block group move has completed.
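
To make the error semantics concrete, here is a small user-space sketch of how a caller might issue the cancel request and interpret the result. It assumes a kernel with this patch applied. The ioctl number 22 and the meaning of EINVAL and ECANCELED are taken from the patch below, while the names request_cancel and cancel_result_message are hypothetical.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * 0x94 is BTRFS_IOCTL_MAGIC. The command number 22 is taken from the
 * patch and is an assumption about a patched kernel.
 */
#define SKETCH_BTRFS_IOCTL_MAGIC 0x94
#define SKETCH_IOC_BALANCE_CANCEL _IO(SKETCH_BTRFS_IOCTL_MAGIC, 22)

/* Map the result of a cancel request to a human-readable message. */
const char *cancel_result_message(int err)
{
	switch (err) {
	case 0:
		return "cancel requested";
	case EINVAL:
		return "no balance is running";
	case ECANCELED:
		return "a cancel is already pending";
	default:
		return "unexpected error";
	}
}

/*
 * Set the cancel flag on the filesystem behind fd.
 * Returns 0 on success, otherwise the errno value from the ioctl.
 */
int request_cancel(int fd)
{
	if (ioctl(fd, SKETCH_IOC_BALANCE_CANCEL) == 0)
		return 0;
	return errno;
}
```

Because the ioctl only sets a flag, a successful return means the request was recorded, not that the balance has already stopped. The balance itself returns -EINTR once the current block group has been moved.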

Signed-off-by: Hugo Mills <hugo@carfax.org.uk>

---
 fs/btrfs/ctree.h   |    1 +
 fs/btrfs/ioctl.c   |   25 +++++++++++++++++++++++++
 fs/btrfs/ioctl.h   |    1 +
 fs/btrfs/volumes.c |    7 ++++++-
 4 files changed, 33 insertions(+), 1 deletion(-)

Index: linux-mainline/fs/btrfs/ctree.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ctree.h	2010-10-29 17:20:43.860460761 +0100
+++ linux-mainline/fs/btrfs/ctree.h	2010-10-29 17:24:06.622214467 +0100
@@ -806,6 +806,7 @@
 struct btrfs_balance_info {
 	u64 expected;
 	u64 completed;
+	int cancel_pending;
 };
 
 struct reloc_control;
Index: linux-mainline/fs/btrfs/ioctl.c
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.c	2010-10-29 17:21:26.128742389 +0100
+++ linux-mainline/fs/btrfs/ioctl.c	2010-10-29 17:27:51.933043374 +0100
@@ -2016,6 +2016,29 @@
 	return ret;
 }
 
+/*
+ * Cancel a running balance operation
+ */
+long btrfs_ioctl_balance_cancel(struct btrfs_fs_info *fs_info)
+{
+	int err = 0;
+
+	spin_lock(&fs_info->balance_info_lock);
+	if(!fs_info->balance_info) {
+		err = -EINVAL;
+		goto error;
+	}
+	if(fs_info->balance_info->cancel_pending) {
+		err = -ECANCELED;
+		goto error;
+	}
+	fs_info->balance_info->cancel_pending = 1;
+
+error:
+	spin_unlock(&fs_info->balance_info_lock);
+	return err;
+}
+
 long btrfs_ioctl(struct file *file, unsigned int
 		cmd, unsigned long arg)
 {
@@ -2051,6 +2074,8 @@
 		return btrfs_balance(root->fs_info->dev_root);
 	case BTRFS_IOC_BALANCE_PROGRESS:
 		return btrfs_ioctl_balance_progress(root->fs_info, argp);
+	case BTRFS_IOC_BALANCE_CANCEL:
+		return btrfs_ioctl_balance_cancel(root->fs_info);
 	case BTRFS_IOC_CLONE:
 		return btrfs_ioctl_clone(file, arg, 0, 0, 0);
 	case BTRFS_IOC_CLONE_RANGE:
Index: linux-mainline/fs/btrfs/ioctl.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.h	2010-10-29 17:05:44.447028825 +0100
+++ linux-mainline/fs/btrfs/ioctl.h	2010-10-29 17:24:06.642213653 +0100
@@ -185,4 +185,5 @@
 				    struct btrfs_ioctl_space_args)
 #define BTRFS_IOC_BALANCE_PROGRESS _IOR(BTRFS_IOCTL_MAGIC, 21, \
 					struct btrfs_ioctl_balance_progress)
+#define BTRFS_IOC_BALANCE_CANCEL _IO(BTRFS_IOCTL_MAGIC, 22)
 #endif
Index: linux-mainline/fs/btrfs/volumes.c
===================================================================
--- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-29 17:23:40.463279287 +0100
+++ linux-mainline/fs/btrfs/volumes.c	2010-10-29 17:24:06.652213246 +0100
@@ -1921,6 +1921,7 @@
 	bal_info->expected = -1; /* One less than actually counted,
 				    because chunk 0 is special */
 	bal_info->completed = 0;
+	bal_info->cancel_pending = 0;
 
 	/* step one make some room on all the devices */
 	list_for_each_entry(device, devices, dev_list) {
@@ -1983,7 +1984,7 @@
 	key.offset = (u64)-1;
 	key.type = BTRFS_CHUNK_ITEM_KEY;
 
-	while (1) {
+	while (!bal_info->cancel_pending) {
 		ret = btrfs_search_slot(NULL, chunk_root, &key, path, 0, 0);
 		if (ret < 0)
 			goto error;
@@ -2024,6 +2025,10 @@
 		       bal_info->completed, bal_info->expected);
 	}
 	ret = 0;
+	if(bal_info->cancel_pending) {
+		printk(KERN_INFO "btrfs: balance cancelled\n");
+		ret = -EINTR;
+	}
 error:
 	btrfs_free_path(path);
 	spin_lock(&dev_root->fs_info->balance_info_lock);




* Re: [patch 1/2] Balance progress monitoring.
  2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
@ 2010-10-30 13:37   ` Hugo Mills
  2010-10-30 13:39   ` [patch 1/2] Balance progress monitoring (updated) Hugo Mills
  1 sibling, 0 replies; 14+ messages in thread
From: Hugo Mills @ 2010-10-30 13:37 UTC (permalink / raw)
  To: linux-btrfs


On Sat, Oct 30, 2010 at 01:07:27AM +0100, Hugo Mills wrote:
> This patch introduces a basic form of progress monitoring for balance
> operations, by counting the number of block groups remaining. The
> information is exposed to userspace by an ioctl.

   Dammit. An unrefreshed quilt patch let an error get through (see
below). Updated patch in a few moments.

   Hugo.

> Index: linux-mainline/fs/btrfs/volumes.c
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-26 18:03:38.000000000 +0100
> +++ linux-mainline/fs/btrfs/volumes.c	2010-10-29 17:23:40.463279287 +0100
> @@ -1902,6 +1902,7 @@
>  	struct btrfs_root *chunk_root = dev_root->fs_info->chunk_root;
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_key found_key;
> +	struct btrfs_balance_status *bal_info;

+       struct btrfs_balance_info *bal_info;


-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
      --- <dragon> A linked list is still a binary tree. Just a ---      
                          very unbalanced one.                           


* [patch 1/2] Balance progress monitoring (updated)
  2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
  2010-10-30 13:37   ` Hugo Mills
@ 2010-10-30 13:39   ` Hugo Mills
  2010-11-01  8:06     ` liubo
  1 sibling, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2010-10-30 13:39 UTC (permalink / raw)
  To: linux-btrfs


This patch introduces a basic form of progress monitoring for balance
operations, by counting the number of block groups remaining. The
information is exposed to userspace by an ioctl.

Signed-off-by: Hugo Mills <hugo@carfax.org.uk>

---
This patch replaces the one previously posted, correcting a minor error.

 fs/btrfs/ctree.h   |    9 ++++++++
 fs/btrfs/disk-io.c |    2 +
 fs/btrfs/ioctl.c   |   34 ++++++++++++++++++++++++++++++++
 fs/btrfs/ioctl.h   |    7 ++++++
 fs/btrfs/volumes.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++--
 5 files changed, 105 insertions(+), 2 deletions(-)

Index: linux-mainline/fs/btrfs/ctree.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ctree.h	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ctree.h	2010-10-30 14:35:25.306450922 +0100
@@ -803,6 +803,11 @@
 	struct list_head cluster_list;
 };
 
+struct btrfs_balance_info {
+	u64 expected;
+	u64 completed;
+};
+
 struct reloc_control;
 struct btrfs_device;
 struct btrfs_fs_devices;
@@ -1010,6 +1015,10 @@
 	unsigned metadata_ratio;
 
 	void *bdev_holder;
+
+	/* Keep track of any rebalance operations on this FS */
+	spinlock_t balance_info_lock;
+	struct btrfs_balance_info *balance_info;
 };
 
 /*
Index: linux-mainline/fs/btrfs/ioctl.c
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.c	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ioctl.c	2010-10-30 14:35:25.396447198 +0100
@@ -1984,6 +1984,38 @@
 	return 0;
 }
 
+/*
+ * Return the current status of any balance operation
+ */
+long btrfs_ioctl_balance_progress(
+	struct btrfs_fs_info *fs_info,
+	struct btrfs_ioctl_balance_progress __user *user_dest)
+{
+	int ret = 0;
+	struct btrfs_ioctl_balance_progress dest;
+
+	spin_lock(&fs_info->balance_info_lock);
+	if (!fs_info->balance_info) {
+		ret = -EINVAL;
+		goto error;
+	}
+
+	dest.expected = fs_info->balance_info->expected;
+	dest.completed = fs_info->balance_info->completed;
+
+	spin_unlock(&fs_info->balance_info_lock);
+
+	if (copy_to_user(user_dest, &dest,
+			 sizeof(struct btrfs_ioctl_balance_progress)))
+		return -EFAULT;
+
+	return 0;
+
+error:
+	spin_unlock(&fs_info->balance_info_lock);
+	return ret;
+}
+
 long btrfs_ioctl(struct file *file, unsigned int
 		cmd, unsigned long arg)
 {
@@ -2017,6 +2049,8 @@
 		return btrfs_ioctl_rm_dev(root, argp);
 	case BTRFS_IOC_BALANCE:
 		return btrfs_balance(root->fs_info->dev_root);
+	case BTRFS_IOC_BALANCE_PROGRESS:
+		return btrfs_ioctl_balance_progress(root->fs_info, argp);
 	case BTRFS_IOC_CLONE:
 		return btrfs_ioctl_clone(file, arg, 0, 0, 0);
 	case BTRFS_IOC_CLONE_RANGE:
Index: linux-mainline/fs/btrfs/ioctl.h
===================================================================
--- linux-mainline.orig/fs/btrfs/ioctl.h	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/ioctl.h	2010-10-30 14:35:25.316450509 +0100
@@ -138,6 +138,11 @@
 	struct btrfs_ioctl_space_info spaces[0];
 };
 
+struct btrfs_ioctl_balance_progress {
+	__u64 expected;
+	__u64 completed;
+};
+
 #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
 				   struct btrfs_ioctl_vol_args)
 #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \
@@ -178,4 +183,6 @@
 #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64)
 #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \
 				    struct btrfs_ioctl_space_args)
+#define BTRFS_IOC_BALANCE_PROGRESS _IOR(BTRFS_IOCTL_MAGIC, 21, \
+					struct btrfs_ioctl_balance_progress)
 #endif
Index: linux-mainline/fs/btrfs/volumes.c
===================================================================
--- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-26 18:03:38.000000000 +0100
+++ linux-mainline/fs/btrfs/volumes.c	2010-10-30 14:35:25.326450096 +0100
@@ -1902,6 +1902,7 @@
 	struct btrfs_root *chunk_root = dev_root->fs_info->chunk_root;
 	struct btrfs_trans_handle *trans;
 	struct btrfs_key found_key;
+	struct btrfs_balance_info *bal_info;
 
 	if (dev_root->fs_info->sb->s_flags & MS_RDONLY)
 		return -EROFS;
@@ -1909,6 +1910,18 @@
 	mutex_lock(&dev_root->fs_info->volume_mutex);
 	dev_root = dev_root->fs_info->dev_root;
 
+	dev_root->fs_info->balance_info = kmalloc(
+		sizeof(struct btrfs_balance_info),
+		GFP_NOFS);
+	if (!dev_root->fs_info->balance_info) {
+		ret = -ENOSPC;
+		goto error_no_status;
+	}
+	bal_info = dev_root->fs_info->balance_info;
+	bal_info->expected = -1; /* One less than actually counted,
+				    because chunk 0 is special */
+	bal_info->completed = 0;
+
 	/* step one make some room on all the devices */
 	list_for_each_entry(device, devices, dev_list) {
 		old_size = device->total_bytes;
@@ -1932,10 +1945,40 @@
 		btrfs_end_transaction(trans, dev_root);
 	}
 
-	/* step two, relocate all the chunks */
+	/* step two, count the chunks */
 	path = btrfs_alloc_path();
-	BUG_ON(!path);
+	if (!path) {
+		ret = -ENOSPC;
+		goto error;
+	}
+
+	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
+	key.offset = (u64)-1;
+	key.type = BTRFS_CHUNK_ITEM_KEY;
+
+	ret = btrfs_search_slot(NULL, chunk_root, &key, path, 0, 0);
+	if (ret <= 0) {
+		printk(KERN_ERR "btrfs: Failed to find the last chunk.\n");
+		BUG();
+	}
+
+	while (1) {
+		ret = btrfs_previous_item(chunk_root, path, 0,
+					  BTRFS_CHUNK_ITEM_KEY);
+		if (ret)
+			break;
+
+		bal_info->expected++;
+	}
+
+	btrfs_free_path(path);
+	path = btrfs_alloc_path();
+	if (!path) {
+		ret = -ENOSPC;
+		goto error;
+	}
 
+	/* step three, relocate all the chunks */
 	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
 	key.offset = (u64)-1;
 	key.type = BTRFS_CHUNK_ITEM_KEY;
@@ -1976,10 +2019,18 @@
 					   found_key.offset);
 		BUG_ON(ret && ret != -ENOSPC);
 		key.offset = found_key.offset - 1;
+		bal_info->completed++;
+		printk(KERN_INFO "btrfs: balance: %llu/%llu block groups completed\n",
+		       bal_info->completed, bal_info->expected);
 	}
 	ret = 0;
 error:
 	btrfs_free_path(path);
+	spin_lock(&dev_root->fs_info->balance_info_lock);
+	kfree(dev_root->fs_info->balance_info);
+	dev_root->fs_info->balance_info = NULL;
+	spin_unlock(&dev_root->fs_info->balance_info_lock);
+error_no_status:
 	mutex_unlock(&dev_root->fs_info->volume_mutex);
 	return ret;
 }
Index: linux-mainline/fs/btrfs/disk-io.c
===================================================================
--- linux-mainline.orig/fs/btrfs/disk-io.c	2010-10-29 17:19:12.000000000 +0100
+++ linux-mainline/fs/btrfs/disk-io.c	2010-10-29 17:20:02.022161666 +0100
@@ -1591,6 +1591,7 @@
 	spin_lock_init(&fs_info->ref_cache_lock);
 	spin_lock_init(&fs_info->fs_roots_radix_lock);
 	spin_lock_init(&fs_info->delayed_iput_lock);
+	spin_lock_init(&fs_info->balance_info_lock);
 
 	init_completion(&fs_info->kobj_unregister);
 	fs_info->tree_root = tree_root;
@@ -1616,6 +1617,7 @@
 	fs_info->sb = sb;
 	fs_info->max_inline = 8192 * 1024;
 	fs_info->metadata_ratio = 0;
+	fs_info->balance_info = NULL;
 
 	fs_info->thread_pool_size = min_t(unsigned long,
 					  num_online_cpus() + 2, 8);

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
      --- <dragon> A linked list is still a binary tree. Just a ---      
                          very unbalanced one.                           


* Re: [patch 0/2] Control filesystem balances (kernel side)
  2010-10-30  0:07 [patch 0/2] Control filesystem balances (kernel side) Hugo Mills
  2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
  2010-10-30  0:07 ` [patch 2/2] Cancel filesystem balance Hugo Mills
@ 2010-10-30 17:44 ` Goffredo Baroncelli
  2010-11-01 12:58   ` Xavier Nicollet
                     ` (2 more replies)
  2 siblings, 3 replies; 14+ messages in thread
From: Goffredo Baroncelli @ 2010-10-30 17:44 UTC (permalink / raw)
  To: linux-btrfs

On Saturday, 30 October, 2010, Hugo Mills wrote:
>    These two patches give a degree of control over balance operations.
> The first makes it possible to get an idea of how much work remains to
> do, by tracking the number of block groups (chunks) that need to be
> moved/rewritten. The second patch allows a running balance operation
> to be cancelled when the current block group has been moved.
> 
>    One fundamental question, though -- is the progress monitor
> function best implemented as an ioctl, as I've done here, or should it
> be two or three sysfs files? I'm thinking of /proc/mdstat...
> Obviously, /proc/mdstat would never get into /sys, but exposing the
> "expected" and "remaining" values as files has an attractive
> simplicity to it.


I like the idea of putting this information under sysfs. Something like:

/sys/btrfs/<filesystem-uuid>/
                             balance	-> info on balancing
                             devices	-> list of devices (a directory of
                                           links or a file which contains
                                           the list of devices)
                             subvolumes/ -> info on subvolume(s)
                             label       -> label of the filesystem
                             <other btrfs filesystem related knobs>

Obviously we need another btrfs command to extract a UUID from a btrfs
filesystem, like:

# btrfs filesystem get-uuid /path/to/a/btrfs/filesystem
f9b9c413-0dc8-4e3f-94f2-86faa702f519
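
A tool consuming such a layout would first have to turn the UUID into a path. The sketch below shows one way that could look. The /sys/btrfs/<filesystem-uuid>/balance location is only the proposal made in this message, not an existing kernel interface, and balance_sysfs_path is a hypothetical helper name.

```c
#include <stdio.h>

/*
 * Build the path of the proposed per-filesystem "balance" file.
 * Returns 0 on success, or -1 if the result does not fit in buf.
 * The directory layout is the one suggested above and is an assumption.
 */
int balance_sysfs_path(char *buf, size_t len, const char *uuid)
{
	int n = snprintf(buf, len, "/sys/btrfs/%s/balance", uuid);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

With the UUID printed by the example command, this yields /sys/btrfs/f9b9c413-0dc8-4e3f-94f2-86faa702f519/balance, which a script could then read with ordinary file I/O.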

> 
>    The user-space side of things are in a separate patch series, to
> follow.
> 
>    Please be gentle with me, this is my first (serious, non-trivial)
> kernel patch. :)
> 
>    Hugo.
> 
> 
> -- 
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>    --- "No!  My collection of rare, incurable diseases! Violated!" ---   
> 


-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512


* Re: [patch 1/2] Balance progress monitoring (updated)
  2010-10-30 13:39   ` [patch 1/2] Balance progress monitoring (updated) Hugo Mills
@ 2010-11-01  8:06     ` liubo
  2010-11-01 12:55       ` Hugo Mills
  0 siblings, 1 reply; 14+ messages in thread
From: liubo @ 2010-11-01  8:06 UTC (permalink / raw)
  To: Hugo Mills; +Cc: linux-btrfs

On 10/30/2010 09:39 PM, Hugo Mills wrote:
> This patch introduces a basic form of progress monitoring for balance
> operations, by counting the number of block groups remaining. The
> information is exposed to userspace by an ioctl.
> 

IMO, it also makes sense to track which blocks are currently being
balanced, for example their blocknr. That would make the monitoring
more useful.

> Signed-off-by: Hugo Mills <hugo@carfax.org.uk>
> 
> ---
> This patch replaces the one previously posted, correcting a minor error.
> 
>  fs/btrfs/ctree.h   |    9 ++++++++
>  fs/btrfs/disk-io.c |    2 +
>  fs/btrfs/ioctl.c   |   34 ++++++++++++++++++++++++++++++++
>  fs/btrfs/ioctl.h   |    7 ++++++
>  fs/btrfs/volumes.c |   55 +++++++++++++++++++++++++++++++++++++++++++++++++++--
>  5 files changed, 105 insertions(+), 2 deletions(-)
> 
> Index: linux-mainline/fs/btrfs/ctree.h
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/ctree.h	2010-10-26 18:03:38.000000000 +0100
> +++ linux-mainline/fs/btrfs/ctree.h	2010-10-30 14:35:25.306450922 +0100
> @@ -803,6 +803,11 @@
>  	struct list_head cluster_list;
>  };
>  
> +struct btrfs_balance_info {
> +	u64 expected;
> +	u64 completed;
> +};
> +
>  struct reloc_control;
>  struct btrfs_device;
>  struct btrfs_fs_devices;
> @@ -1010,6 +1015,10 @@
>  	unsigned metadata_ratio;
>  
>  	void *bdev_holder;
> +
> +	/* Keep track of any rebalance operations on this FS */
> +	spinlock_t balance_info_lock;
> +	struct btrfs_balance_info *balance_info;
>  };
>  
>  /*
> Index: linux-mainline/fs/btrfs/ioctl.c
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/ioctl.c	2010-10-26 18:03:38.000000000 +0100
> +++ linux-mainline/fs/btrfs/ioctl.c	2010-10-30 14:35:25.396447198 +0100
> @@ -1984,6 +1984,38 @@
>  	return 0;
>  }
>  
> +/*
> + * Return the current status of any balance operation
> + */
> +long btrfs_ioctl_balance_progress(
> +	struct btrfs_fs_info *fs_info,
> +	struct btrfs_ioctl_balance_progress __user *user_dest)
> +{
> +	int ret = 0;
> +	struct btrfs_ioctl_balance_progress dest;
> +
> +	spin_lock(&fs_info->balance_info_lock);
> +	if (!fs_info->balance_info) {
> +		ret = -EINVAL;
> +		goto error;
> +	}
> +
> +	dest.expected = fs_info->balance_info->expected;
> +	dest.completed = fs_info->balance_info->completed;
> +
> +	spin_unlock(&fs_info->balance_info_lock);
> +
> +	if (copy_to_user(user_dest, &dest,
> +			 sizeof(struct btrfs_ioctl_balance_progress)))
> +		return -EFAULT;
> +
> +	return 0;
> +
> +error:
> +	spin_unlock(&fs_info->balance_info_lock);
> +	return ret;
> +}
> +
>  long btrfs_ioctl(struct file *file, unsigned int
>  		cmd, unsigned long arg)
>  {
> @@ -2017,6 +2049,8 @@
>  		return btrfs_ioctl_rm_dev(root, argp);
>  	case BTRFS_IOC_BALANCE:
>  		return btrfs_balance(root->fs_info->dev_root);
> +	case BTRFS_IOC_BALANCE_PROGRESS:
> +		return btrfs_ioctl_balance_progress(root->fs_info, argp);
>  	case BTRFS_IOC_CLONE:
>  		return btrfs_ioctl_clone(file, arg, 0, 0, 0);
>  	case BTRFS_IOC_CLONE_RANGE:
> Index: linux-mainline/fs/btrfs/ioctl.h
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/ioctl.h	2010-10-26 18:03:38.000000000 +0100
> +++ linux-mainline/fs/btrfs/ioctl.h	2010-10-30 14:35:25.316450509 +0100
> @@ -138,6 +138,11 @@
>  	struct btrfs_ioctl_space_info spaces[0];
>  };
>  
> +struct btrfs_ioctl_balance_progress {
> +	__u64 expected;
> +	__u64 completed;
> +};
> +
>  #define BTRFS_IOC_SNAP_CREATE _IOW(BTRFS_IOCTL_MAGIC, 1, \
>  				   struct btrfs_ioctl_vol_args)
>  #define BTRFS_IOC_DEFRAG _IOW(BTRFS_IOCTL_MAGIC, 2, \
> @@ -178,4 +183,6 @@
>  #define BTRFS_IOC_DEFAULT_SUBVOL _IOW(BTRFS_IOCTL_MAGIC, 19, u64)
>  #define BTRFS_IOC_SPACE_INFO _IOWR(BTRFS_IOCTL_MAGIC, 20, \
>  				    struct btrfs_ioctl_space_args)
> +#define BTRFS_IOC_BALANCE_PROGRESS _IOR(BTRFS_IOCTL_MAGIC, 21, \
> +					struct btrfs_ioctl_balance_progress)
>  #endif
> Index: linux-mainline/fs/btrfs/volumes.c
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/volumes.c	2010-10-26 18:03:38.000000000 +0100
> +++ linux-mainline/fs/btrfs/volumes.c	2010-10-30 14:35:25.326450096 +0100
> @@ -1902,6 +1902,7 @@
>  	struct btrfs_root *chunk_root = dev_root->fs_info->chunk_root;
>  	struct btrfs_trans_handle *trans;
>  	struct btrfs_key found_key;
> +	struct btrfs_balance_info *bal_info;
>  
>  	if (dev_root->fs_info->sb->s_flags & MS_RDONLY)
>  		return -EROFS;
> @@ -1909,6 +1910,18 @@
>  	mutex_lock(&dev_root->fs_info->volume_mutex);
>  	dev_root = dev_root->fs_info->dev_root;
>  
> +	dev_root->fs_info->balance_info = kmalloc(
> +		sizeof(struct btrfs_balance_info),
> +		GFP_NOFS);
> +	if (!dev_root->fs_info->balance_info) {
> +		ret = -ENOSPC;

-ENOMEM would be better here, since the failure comes from kmalloc().

> +		goto error_no_status;
> +	}
> +	bal_info = dev_root->fs_info->balance_info;
> +	bal_info->expected = -1; /* One less than actually counted,
> +				    because chunk 0 is special */
> +	bal_info->completed = 0;
> +
>  	/* step one make some room on all the devices */
>  	list_for_each_entry(device, devices, dev_list) {
>  		old_size = device->total_bytes;
> @@ -1932,10 +1945,40 @@
>  		btrfs_end_transaction(trans, dev_root);
>  	}
>  
> -	/* step two, relocate all the chunks */
> +	/* step two, count the chunks */
>  	path = btrfs_alloc_path();
> -	BUG_ON(!path);
> +	if (!path) {
> +		ret = -ENOSPC;

ditto

> +		goto error;
> +	}
> +
> +	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
> +	key.offset = (u64)-1;
> +	key.type = BTRFS_CHUNK_ITEM_KEY;
> +
> +	ret = btrfs_search_slot(NULL, chunk_root, &key, path, 0, 0);
> +	if (ret <= 0) {
> +		printk(KERN_ERR "btrfs: Failed to find the last chunk.\n");
> +		BUG();
> +	}
> +
> +	while (1) {
> +		ret = btrfs_previous_item(chunk_root, path, 0,
> +					  BTRFS_CHUNK_ITEM_KEY);
> +		if (ret)
> +			break;
> +
> +		bal_info->expected++;
> +	}
> +
> +	btrfs_free_path(path);
> +	path = btrfs_alloc_path();
> +	if (!path) {
> +		ret = -ENOSPC;

ditto

> +		goto error;
> +	}
>  
> +	/* step three, relocate all the chunks */
>  	key.objectid = BTRFS_FIRST_CHUNK_TREE_OBJECTID;
>  	key.offset = (u64)-1;
>  	key.type = BTRFS_CHUNK_ITEM_KEY;
> @@ -1976,10 +2019,18 @@
>  					   found_key.offset);
>  		BUG_ON(ret && ret != -ENOSPC);
>  		key.offset = found_key.offset - 1;
> +		bal_info->completed++;
> +		printk(KERN_INFO "btrfs: balance: %llu/%llu block groups completed\n",
> +		       bal_info->completed, bal_info->expected);

Would you please also printk found_key.offset, i.e. the chunk that the
balance code is currently processing? That would be helpful.


thanks,
liubo

>  	}
>  	ret = 0;
>  error:
>  	btrfs_free_path(path);
> +	spin_lock(&dev_root->fs_info->balance_info_lock);
> +	kfree(dev_root->fs_info->balance_info);
> +	dev_root->fs_info->balance_info = NULL;
> +	spin_unlock(&dev_root->fs_info->balance_info_lock);
> +error_no_status:
>  	mutex_unlock(&dev_root->fs_info->volume_mutex);
>  	return ret;
>  }
> Index: linux-mainline/fs/btrfs/disk-io.c
> ===================================================================
> --- linux-mainline.orig/fs/btrfs/disk-io.c	2010-10-29 17:19:12.000000000 +0100
> +++ linux-mainline/fs/btrfs/disk-io.c	2010-10-29 17:20:02.022161666 +0100
> @@ -1591,6 +1591,7 @@
>  	spin_lock_init(&fs_info->ref_cache_lock);
>  	spin_lock_init(&fs_info->fs_roots_radix_lock);
>  	spin_lock_init(&fs_info->delayed_iput_lock);
> +	spin_lock_init(&fs_info->balance_info_lock);
>  
>  	init_completion(&fs_info->kobj_unregister);
>  	fs_info->tree_root = tree_root;
> @@ -1616,6 +1617,7 @@
>  	fs_info->sb = sb;
>  	fs_info->max_inline = 8192 * 1024;
>  	fs_info->metadata_ratio = 0;
> +	fs_info->balance_info = NULL;
>  
>  	fs_info->thread_pool_size = min_t(unsigned long,
>  					  num_online_cpus() + 2, 8);
> 
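
The two review points above (report an allocation failure as -ENOMEM,
and include the chunk offset in the progress message) can be sketched in
plain userspace C. This is only an illustration, not kernel code: the
names balance_info, balance_info_alloc() and balance_format_progress()
are made up for the example, and malloc() stands in for kmalloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for the patch's progress record. */
struct balance_info {
	unsigned long long expected;	/* chunks counted before relocation */
	unsigned long long completed;	/* chunks relocated so far */
};

/*
 * Allocate and zero a progress record.  A failed allocation is reported
 * as -ENOMEM (out of memory), not -ENOSPC (out of disk space).
 */
int balance_info_alloc(struct balance_info **out)
{
	struct balance_info *bi;

	assert(out != NULL);
	bi = malloc(sizeof(*bi));
	if (!bi)
		return -ENOMEM;
	memset(bi, 0, sizeof(*bi));
	*out = bi;
	return 0;
}

/*
 * Format one progress line that also names the chunk being processed.
 * Returns whatever snprintf() returns.
 */
int balance_format_progress(char *buf, size_t len,
			    const struct balance_info *bi,
			    unsigned long long chunk_offset)
{
	return snprintf(buf, len,
			"btrfs: balance: chunk %llu: %llu/%llu block groups completed",
			chunk_offset, bi->completed, bi->expected);
}
```

The caller owns the record and releases it with free(); in the patch
the equivalent kfree() is done under balance_info_lock.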


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 0/2] Control filesystem balances (kernel side)
  2010-11-01 12:58   ` Xavier Nicollet
@ 2010-11-01 12:52     ` Tomasz Torcz
  0 siblings, 0 replies; 14+ messages in thread
From: Tomasz Torcz @ 2010-11-01 12:52 UTC (permalink / raw)
  To: linux-btrfs

On Mon, Nov 01, 2010 at 01:58:21PM +0100, Xavier Nicollet wrote:
> On 30 October 2010 at 19:44, Goffredo Baroncelli wrote:
> > I like the idea that these info should be put under sysfs. Something like
> >
> > /sys/btrfs/<filesystem-uuid>/
> >                              balance	-> info on balancing
> >                              devices	-> list of device (a directory of
> >                                            links or a file which contains
> >                                            the list of devices)
> >                              subvolumes/ -> info on subvolume(s)
> >                              label       -> label of the filesystem
> >                              <other btrfs filesystem related knobs>
>
> Well, mdstat stats are under /proc/mdstat.
> Is sysfs the ideal place ?

  The md stats are in sysfs as well: /sys/block/md127/md/ contains
sync_action, sync_completed, sync_speed, reshape_position, etc.

The /proc file is legacy.
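
For comparison with the balance counters discussed in this thread: as
far as I know, md's sync_completed attribute holds two sector counts
separated by a "/" while a sync is running, and a word such as "none"
otherwise. A minimal sketch of reading that format, assuming the
attribute text has already been loaded into a string;
parse_sync_completed() is a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Parse text of the form "DONE / TOTAL" (sector counts).
 * Returns 0 on success, -EINVAL for anything else, e.g. "none".
 */
int parse_sync_completed(const char *text,
			 unsigned long long *done,
			 unsigned long long *total)
{
	assert(text != NULL && done != NULL && total != NULL);
	/* Spaces in the format match any amount of whitespace. */
	if (sscanf(text, "%llu / %llu", done, total) != 2)
		return -EINVAL;
	return 0;
}
```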


-- 
Tomasz Torcz               "Never underestimate the bandwidth of a station
xmpp: zdzichubg@chrome.pl    wagon filled with backup tapes." -- Jim Gray

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 1/2] Balance progress monitoring (updated)
  2010-11-01  8:06     ` liubo
@ 2010-11-01 12:55       ` Hugo Mills
  2010-11-02  0:51         ` liubo
  0 siblings, 1 reply; 14+ messages in thread
From: Hugo Mills @ 2010-11-01 12:55 UTC (permalink / raw)
  To: liubo; +Cc: Hugo Mills, linux-btrfs

On Mon, Nov 01, 2010 at 04:06:53PM +0800, liubo wrote:
> On 10/30/2010 09:39 PM, Hugo Mills wrote:
> > This patch introduces a basic form of progress monitoring for balance
> > operations, by counting the number of block groups remaining. The
> > information is exposed to userspace by an ioctl.
> > 
> 
> IMO, tracking the information of blocks which are balancing also makes sense. 
> For example, the block information's blocknr. 
> It can help us monitor better.

   I don't see how that will help. The block group IDs (which is all
that we get at this level) are effectively arbitrary 64-bit numbers,
and are what appear in the kernel logs. How could that information be
used to improve monitoring?

   I'm not ruling out the idea completely -- I just can't see at the
moment how it would be used.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
          --- Is a diversity twice as good as a university? ---          

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 0/2] Control filesystem balances (kernel side)
  2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
@ 2010-11-01 12:58   ` Xavier Nicollet
  2010-11-01 12:52     ` Tomasz Torcz
  2010-11-01 13:05   ` Hugo Mills
  2010-11-04 22:55   ` RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)] Goffredo Baroncelli
  2 siblings, 1 reply; 14+ messages in thread
From: Xavier Nicollet @ 2010-11-01 12:58 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: linux-btrfs

On 30 October 2010 at 19:44, Goffredo Baroncelli wrote:
> I like the idea that these info should be put under sysfs. Something like
>
> /sys/btrfs/<filesystem-uuid>/
>                              balance	-> info on balancing
>                              devices	-> list of device (a directory of
>                                            links or a file which contains
>                                            the list of devices)
>                              subvolumes/ -> info on subvolume(s)
>                              label       -> label of the filesystem
>                              <other btrfs filesystem related knobs>

Well, mdstat stats are under /proc/mdstat.
Is sysfs the ideal place ?

Just asking.

-- 
Xavier Nicollet

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 0/2] Control filesystem balances (kernel side)
  2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
  2010-11-01 12:58   ` Xavier Nicollet
@ 2010-11-01 13:05   ` Hugo Mills
  2010-11-04 22:55   ` RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)] Goffredo Baroncelli
  2 siblings, 0 replies; 14+ messages in thread
From: Hugo Mills @ 2010-11-01 13:05 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: linux-btrfs

On Sat, Oct 30, 2010 at 07:44:35PM +0200, Goffredo Baroncelli wrote:
> On Saturday, 30 October, 2010, Hugo Mills wrote:
> >    One fundamental question, though -- is the progress monitor
> > function best implemented as an ioctl, as I've done here, or should it
> > be two or three sysfs files? I'm thinking of /proc/mdstat...
> > Obviously, /proc/mdstat would never get into /sys, but exposing the
> > "expected" and "remaining" values as files has an attractive
> > simplicity to it.
> 
> I like the idea that these info should be put under sysfs. Something like
> 
> /sys/btrfs/<filesystem-uuid>/

/sys/fs/btrfs/<uuid> I think. Also:
/sys/fs/btrfs/<label> as a symlink to the <uuid> directory.

>                              balance	-> info on balancing

For the one-value-per-file rule of sysfs, this should probably be
balance_expected and balance_completed, each holding a count of block
groups.
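
   With one value per file, a userspace monitor reduces to reading two
small integers. A minimal sketch, assuming the hypothetical attribute
names balance_expected and balance_completed suggested above (neither
exists in any kernel I know of); parse_count(), read_count_file() and
balance_percent() are made-up helper names:

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one unsigned decimal count, as a single-value attribute holds. */
int parse_count(const char *text, unsigned long long *out)
{
	char *end;
	unsigned long long v;

	assert(text != NULL && out != NULL);
	while (isspace((unsigned char)*text))
		text++;
	if (!isdigit((unsigned char)*text))
		return -EINVAL;		/* rejects "", "-1", "abc" */
	errno = 0;
	v = strtoull(text, &end, 10);
	if (errno)
		return -ERANGE;
	while (isspace((unsigned char)*end))
		end++;
	if (*end != '\0')
		return -EINVAL;		/* trailing junk after the number */
	*out = v;
	return 0;
}

/* Read the first line of a file and parse it as a count. */
int read_count_file(const char *path, unsigned long long *out)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -errno;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -EIO;
	}
	fclose(f);
	return parse_count(buf, out);
}

/* Whole-number percentage of block groups done; 0 if none are expected. */
unsigned int balance_percent(unsigned long long completed,
			     unsigned long long expected)
{
	if (expected == 0)
		return 0;
	if (completed >= expected)
		return 100;
	/* Assumes counts far below 2^64 / 100, so the multiply cannot wrap. */
	return (unsigned int)(completed * 100 / expected);
}
```

   A monitor would call read_count_file() once per attribute under
/sys/fs/btrfs/<uuid>/ and pass both results to balance_percent().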

>                              devices	-> list of device (a directory of
>                                            links or a file which contains 
>                                            the list of devices)
>                              subvolumes/ -> info on subvolume(s)
>                              label       -> label of the filesystem
>                              <other btrfs filesystem related knobs>

   The other one that struck me earlier today as being useful was
tracking the progress of a dev delete operation. But that'll come
later.

> Obviously we need another btrfs command to extract an uuid from a btrfs 
> filesystem like:
> 
> # btrfs filesystem get-uuid /path/to/a/btrfs/filesystem
> f9b9c413-0dc8-4e3f-94f2-86faa702f519

   Possibly a slightly more general "fi metadata" with switches for
UUID and label?

# btrfs fi metadata [-u|--uuid] /path
# btrfs fi metadata [-l|--label] /path

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
          --- Is a diversity twice as good as a university? ---          

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [patch 1/2] Balance progress monitoring (updated)
  2010-11-01 12:55       ` Hugo Mills
@ 2010-11-02  0:51         ` liubo
  0 siblings, 0 replies; 14+ messages in thread
From: liubo @ 2010-11-02  0:51 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

On 11/01/2010 08:55 PM, Hugo Mills wrote:
> On Mon, Nov 01, 2010 at 04:06:53PM +0800, liubo wrote:
>> On 10/30/2010 09:39 PM, Hugo Mills wrote:
>>> This patch introduces a basic form of progress monitoring for balance
>>> operations, by counting the number of block groups remaining. The
>>> information is exposed to userspace by an ioctl.
>>>
>> IMO, tracking the information of blocks which are balancing also makes sense. 
>> For example, the block information's blocknr. 
>> It can help us monitor better.
> 
>    I don't see how that will help. The block group IDs (which is all
> that we get at this level) are effectively arbitrary 64-bit numbers,
> and are what appear in the kernel logs. How could that information be
> used to improve monitoring?

The same 64-bit numbers are also shown by btrfs-debug-tree, so with
btrfs-debug-tree they would help in tracking the balanced extent buffers.

thanks,
liubo

> 
>    I'm not ruling out the idea completely -- I just can't see at the
> moment how it would be used.
> 
>    Hugo.
> 


^ permalink raw reply	[flat|nested] 14+ messages in thread

* RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)]
  2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
  2010-11-01 12:58   ` Xavier Nicollet
  2010-11-01 13:05   ` Hugo Mills
@ 2010-11-04 22:55   ` Goffredo Baroncelli
  2010-11-05 12:41     ` Hugo Mills
  2 siblings, 1 reply; 14+ messages in thread
From: Goffredo Baroncelli @ 2010-11-04 22:55 UTC (permalink / raw)
  To: linux-btrfs; +Cc: Hugo Mills

Hi all,

I made a prototype for exporting info from btrfs via sysfs.
It creates two directories under /sys/btrfs, named "fs" and "devices".


/sys/btrfs/fs/<fs-uuid>/
                         label		-> filesystem label
                         num_devices    -> total number of devices
                         open_devices   -> number of opened devices
                         [...]
/sys/btrfs/devices/<dev-uuid>/
                         devid          -> btrfs device number
                         fsid           -> filesystem uuid (fs-uuid)
                         major, minor   -> device major and minor numbers
                         name           -> device name
                         writeable      -> is the device writeable

where <fs-uuid> is the filesystem uuid, and <dev-uuid> is the device uuid. The 
link between a device and its filesystem is the device's fsid attribute.

I created this structure because we have to handle the case where the devices 
are present (e.g. after a "btrfs device scan") but the filesystem isn't 
mounted. In that case the devices/ subdirectory is populated, while the fs/ 
subdirectory is empty.

I haven't attached a patch because the code is very ugly.
Comments? Thoughts?

Below is an example of use.

$ /sbin/blkid img*
img0.img: UUID="099ea4b7-96dd-41fc-91df-0d1ab0066e05" UUID_SUB="1103c4e9-2dba-4b58-82ea-7c7c633fe04a" TYPE="btrfs"
img1.img: UUID="099ea4b7-96dd-41fc-91df-0d1ab0066e05" UUID_SUB="d677e338-5eb0-4373-a540-78b9e7938987" TYPE="btrfs"
img2.img: UUID="099ea4b7-96dd-41fc-91df-0d1ab0066e05" UUID_SUB="de5e3fbf-2400-438c-95b5-e4c876d96bed" TYPE="btrfs"
img3.img: UUID="099ea4b7-96dd-41fc-91df-0d1ab0066e05" UUID_SUB="019b1657-edad-488e-ad72-ccd2ea92e3ac" TYPE="btrfs"

$ (cd /sys/fs/btrfs/; for i in */*/*; do echo -e "$i:\t$(cat $i)"; done )
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/devid:     4
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/fsid:      099ea4b7-96dd-41fc-91df-0d1ab0066e05
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/major:     98
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/minor:     64
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/name:      /dev/ubde
devices/019b1657-edad-488e-ad72-ccd2ea92e3ac/writeable: 1
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/devid:     1
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/fsid:      099ea4b7-96dd-41fc-91df-0d1ab0066e05
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/major:     98
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/minor:     16
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/name:      /dev/ubdb
devices/1103c4e9-2dba-4b58-82ea-7c7c633fe04a/writeable: 1
devices/d677e338-5eb0-4373-a540-78b9e7938987/devid:     2
devices/d677e338-5eb0-4373-a540-78b9e7938987/fsid:      099ea4b7-96dd-41fc-91df-0d1ab0066e05
devices/d677e338-5eb0-4373-a540-78b9e7938987/major:     98
devices/d677e338-5eb0-4373-a540-78b9e7938987/minor:     32
devices/d677e338-5eb0-4373-a540-78b9e7938987/name:      /dev/ubdc
devices/d677e338-5eb0-4373-a540-78b9e7938987/writeable: 1
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/devid:     3
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/fsid:      099ea4b7-96dd-41fc-91df-0d1ab0066e05
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/major:     98
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/minor:     48
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/name:      /dev/ubdd
devices/de5e3fbf-2400-438c-95b5-e4c876d96bed/writeable: 1
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/blocks_used:    32768
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/blocksize:      4096
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/label:
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/num_devices:    4
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/open_devices:   4
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/rw_devices:     4
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/total_blocks:   2222981120
fs/099ea4b7-96dd-41fc-91df-0d1ab0066e05/total_devices:  4




-- 
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijack@inwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E  C054 BF04 F161 3DC5 0512

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)]
  2010-11-04 22:55   ` RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)] Goffredo Baroncelli
@ 2010-11-05 12:41     ` Hugo Mills
  0 siblings, 0 replies; 14+ messages in thread
From: Hugo Mills @ 2010-11-05 12:41 UTC (permalink / raw)
  To: Goffredo Baroncelli; +Cc: linux-btrfs, Hugo Mills

   Hi, Goffredo,

On Thu, Nov 04, 2010 at 11:55:24PM +0100, Goffredo Baroncelli wrote:
> I make a prototype for exporting info from btrfs via sysfs.

   Good stuff. I was going to take a look at doing that this
weekend. :)

> Under /sys/btrfs were created two directories, named "fs" and "devices".
>
> /sys/btrfs/fs/<fs-uuid>/

   I'm pretty sure that /sys/btrfs won't get through any discussion on
LKML. I'd suggest /sys/fs/btrfs as the base, since that's where the
other filesystems seem to put their sysfs information.

>                          label		-> filesystem label
>                          num_devices    -> total number of devices
>                          open_devices   -> number of opened devices
>                          [...]
> /sys/btrfs/devices/<dev-uuid>/
>                          devid          -> btrfs device number
>                          fsid           -> filesystem uuid (fs-uuid)
>                          major, minor   -> major minor

   I think the major, minor should instead be a symlink to the
relevant entry in /sys/devices/...  (as done in /sys/block/*) or
/sys/block (as done in /sys/block/md*/slaves). Call it "device".

>                          name           -> device name

   Unnecessary -- and also, I think, unlikely to get through LKML
review. Putting a device name here implies that the kernel knows
better than userspace what the name of the device is (i.e. which
device node you should be using). Having the link to /sys/block/* or
/sys/devices/... as above is, I think, all that's needed here.
Userspace should be able to convert the major/minor pair kept in
/sys/fs/btrfs/devices/<uuid>/device/dev appropriately.
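
   As a rough illustration of that last step: the sysfs "dev" attribute
of a block device holds a single line of the form MAJOR:MINOR, so
userspace only has to split it. A minimal sketch, assuming the text of
the attribute has already been read into a string; parse_dev_attr() is
a made-up helper name, not part of any existing tool:

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Split the contents of a sysfs "dev" attribute ("MAJOR:MINOR\n")
 * into its two numbers.  Returns 0 on success, -EINVAL otherwise.
 */
int parse_dev_attr(const char *text, unsigned int *major_out,
		   unsigned int *minor_out)
{
	unsigned int maj, min;

	assert(text != NULL && major_out != NULL && minor_out != NULL);
	if (sscanf(text, "%u:%u", &maj, &min) != 2)
		return -EINVAL;
	*major_out = maj;
	*minor_out = min;
	return 0;
}
```

   With the pair in hand, userspace can look up whichever device node
it prefers instead of trusting a name supplied by the kernel.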

>                          writeable      -> is the device writeable

> where <fs-uuid> is the filesystem uuid, and <dev-uuid> is the device uuid. The 
> link between devices and filesystem is the <fsid> parameter of a device.

   Could that be made a symlink instead? That seems to be the usual
approach in sysfs.

> I create these structure because we should handle the case were the devices 
> are present (like after a "btrfs device scan") but the filesystem aren't 
> mounted.

   ... ah, I see it can't. (Re: my previous comment)

> In this case the devices/ subdirectory is populated. Instead the fs/ 
> subdirectory is empty.
> 
> I don't attach a patch because the code is very ugly.
> Comments ? Thoughts ?

   Is it ugly because there are significant difficulties in making
btrfs or sysfs do this, or just because you hacked something together
as quickly as possible for a demo?

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- "There's a Martian war machine outside -- they want to talk ---   
                to you about a cure for the common cold."                

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-11-05 12:41 UTC | newest]

Thread overview: 14+ messages
2010-10-30  0:07 [patch 0/2] Control filesystem balances (kernel side) Hugo Mills
2010-10-30  0:07 ` [patch 1/2] Balance progress monitoring Hugo Mills
2010-10-30 13:37   ` Hugo Mills
2010-10-30 13:39   ` [patch 1/2] Balance progress monitoring (updated) Hugo Mills
2010-11-01  8:06     ` liubo
2010-11-01 12:55       ` Hugo Mills
2010-11-02  0:51         ` liubo
2010-10-30  0:07 ` [patch 2/2] Cancel filesystem balance Hugo Mills
2010-10-30 17:44 ` [patch 0/2] Control filesystem balances (kernel side) Goffredo Baroncelli
2010-11-01 12:58   ` Xavier Nicollet
2010-11-01 12:52     ` Tomasz Torcz
2010-11-01 13:05   ` Hugo Mills
2010-11-04 22:55   ` RFC: exporting info via sysfs [was Re: [patch 0/2] Control filesystem balances (kernel side)] Goffredo Baroncelli
2010-11-05 12:41     ` Hugo Mills
