All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 0/5] Add netlink file system notification interface
@ 2011-08-18 12:18 Lukas Czerner
  2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
                   ` (4 more replies)
  0 siblings, 5 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel

Hello.

this is my proposal to add netlink notification interface which can be used
by file system to send various warning messages to user space via netlink.
This is actually the same what quota uses to send warnings about exceeding
quota and it can be eventually merged into this interface so we have just
one netlink messaging system for file systems.

The first PATCH adds the netlink interface itself and the rest of the
patches wire it up into the file systems (ext3,ext4,xfs,btrfs). So far it
can only send information about data and metadata ENOSPC, but it can be
easily extended, for example for using quota notification.


Here is a description of the first patch:

 There might be a lot of crazy things happening inside the file systems
 but it might result in bogus error code returned to the user space. And
 sometimes it is hard to figure out what just happened. This
 commit adds the interface which can be used by file systems to send
 better information to the user space via netlink interface, because it
 is not bound with error codes.

 Also applications might not report problems with the file systems
 correctly, hence the administrator will never know about the problem
 unless it is already too late. Also in the case of ENOSPC conditions
 even if we are checking 'df' output from cronjob, we might miss some
 ENOSPC states because it just takes snapshots of the state. Those are
 just examples.

 With this interface file system can send a message via netlink interface
 at the moment when the problem arises.

 In order to use this file system must register the netlink
 interface with init_fs_nl_family() on module initialization and the in
 can send messages with fs_nl_send_warning().

 At this point there are only two types of warning FS_NL_ENOSPC_WARN,
 which should be used in situations when file system does not have enough
 space to reserve data blocks and FS_NL_META_ENOSPC_WARN, for situations
 when file system does not have enough space to reserve metadata blocks.
 But more can be added in the future.

 The code has been based on fs/quota/netlink.c which is used to send
 quota warnings to the user space. Eventually it can be merged into this
 interface.

I have tested this with a simple tool which is testing various enospc
situations and with fallocate. You can find the tool bellow (alloc_test)

For the user space to receive the messages I have written a simple tool
(based on quota_nld code), you can find it bellow (fsmfg).

Thanks!
-Lukas

--- 
[PATCH 1/5] fs: add netlink notification interface
[PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions
[PATCH 3/5] ext4: use fs netlink interface for ENOSPC conditions
[PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
[PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions

fs/Makefile                  |    2 +-
fs/btrfs/extent-tree.c       |   13 ++++-
fs/btrfs/super.c             |    1 +
fs/ext3/acl.c                |    5 +-
fs/ext3/balloc.c             |   10 +++-
fs/ext3/inode.c              |    4 +-
fs/ext3/namei.c              |   10 ++--
fs/ext3/super.c              |    1 +
fs/ext3/xattr.c              |    2 +-
fs/ext4/acl.c                |    5 +-
fs/ext4/balloc.c             |   15 ++++--
fs/ext4/ext4.h               |    3 +-
fs/ext4/extents.c            |    2 +-
fs/ext4/indirect.c           |    2 +-
fs/ext4/inode.c              |   10 ++--
fs/ext4/namei.c              |   10 ++--
fs/ext4/super.c              |    1 +
fs/ext4/xattr.c              |    2 +-
fs/netlink.c                 |  107 ++++++++++++++++++++++++++++++++++++++++++
fs/xfs/linux-2.6/xfs_file.c  |    2 +
fs/xfs/linux-2.6/xfs_super.c |    1 +
fs/xfs/xfs_vnodeops.c        |   10 +++-
include/linux/ext3_fs.h      |    3 +-
include/linux/fs.h           |   11 ++++
24 files changed, 193 insertions(+), 39 deletions(-)

--- 

alloc_test.c
------------
#define _GNU_SOURCE

#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <stdint.h>
#include <limits.h>
#include <string.h>
#include <linux/types.h>


#define BSIZE 4096
#define FILECOUNT 9000000

unsigned long long
do_write(int fd, char *data, int size)
{
        unsigned long long count = 0, len = 0;
        while (1) {
                len = write(fd, data, size);
                if (errno) {
                        perror("errno");
                        break;
                }
                count += len;
        }
	fsync(fd);
        return count;
}

void write_test(char *filename, int flag)
{
	int err, fd, ps = getpagesize();
        unsigned long long count = 0;
        void *data;

	fd = open(filename, flag );
        if (fd < 0) {
                perror("open");
                exit(1);
        }

        err = posix_memalign(&data, ps, BSIZE);
        if (err) {
                perror("posix_memalign");
                exit(1);
        }
        memset(data, 0, BSIZE);
        count = do_write(fd, data, BSIZE);   
        printf("%llu Bytes written\n", count);
        close(fd);

	remove(filename);
	sync();
	errno = 0;
}

void dir_create(char *dir, long count)
{
	int fd, err = 0;
	char fname[256];
	long i = count;

	printf("0/%ld", count);
	while (i) {
		sprintf(fname, "%s/testfile_%ld", dir, i);
		fd = mkdir(fname, 0);
		if (fd < 0) {
			perror("open");
			err = 1;
			goto cleanup;
		}
		close(fd);
		i--;
		if (!(i % 10000))
			printf("\r%ld/%ld", count -i, count);
	}
	sync();
cleanup:
	printf("\n%ld files created in the directory %s\n", count -i, dir);
	while (i < count) {
		i++;
		sprintf(fname, "%s/testfile_%ld", dir, i);
		fd = rmdir(fname);
		if (fd < 0) {
			printf("name %s\n", fname);
			perror("unlink");
			exit(1);
		}
		if (!(i % 10000))
			printf("\r%ld/%ld", count -i, count);
	}
	printf("\r\n");
	sync();
}

void file_create(char *dir, long count)
{
	int fd, err = 0;
	char fname[256];
	long i = count;

	printf("0/%ld", count);
	while (i > 0) {
		sprintf(fname, "%s/testfile_%ld", dir, i);
		fd = creat(fname, 0);
		if (fd < 0) {
			perror("creat");
			err = 1;
			goto cleanup;
		}
		close(fd);
		i--;
		if (!(i % 10000))
			printf("\r%ld/%ld", count -i, count);
	}
	sync();
cleanup:
	printf("\n%ld files created in the directory %s\n", count -i, dir);
	while (i < count) {
		i++;
		sprintf(fname, "%s/testfile_%ld", dir, i);
		fd = unlink(fname);
		if (fd < 0) {
			printf("name %s\n", fname);
			perror("unlink");
			exit(1);
		}
		if (!(i % 10000))
			printf("\r%ld/%ld", count -i, count);
	}
	printf("\r\n");
	sync();
}

void test_all(char *dir)
{
	char fname[256];

	sprintf(fname, "%s/testfile", dir);

	printf("[+] Buffered write test\n");
	write_test(fname, O_RDWR | O_CREAT | O_TRUNC);
	write_test(fname, O_RDWR | O_CREAT | O_TRUNC);

	printf("[+] Direct write test\n");
	write_test(fname, O_RDWR | O_CREAT | O_TRUNC | O_DIRECT);
	write_test(fname, O_RDWR | O_CREAT | O_TRUNC | O_DIRECT);

	printf("[+] File creation test\n");
	file_create(dir, FILECOUNT);
	file_create(dir, FILECOUNT);

	printf("[+] Directory creation test\n");
	dir_create(dir, FILECOUNT);
	dir_create(dir, FILECOUNT);
}

int main(int argc, char **argv)
{
        int type;
	long count = FILECOUNT;
	char fname[256];

        if (argc < 2) {
                printf("Usage: %s <filename> n\n", argv[0]);
                printf("n=1 buffered write\nn=2 direct write\nn=3 file creation\n");
                exit(1);
        }


	if (argc == 2) {
		test_all(argv[1]);
		return 0;
	}

	sprintf(fname, "%s/testfile", argv[1]);
        type = atoi(argv[2]);
        switch (type) {
                case 1:
			printf("[+] Buffered write test\n");
                        write_test(fname, O_RDWR | O_CREAT | O_TRUNC);
                        break;
                case 2:
			printf("[+] Direct write test\n");
                        write_test(fname, O_RDWR | O_CREAT | O_TRUNC | O_DIRECT);
			break;
		case 3:
			printf("[+] File creation test\n");
			if (argc == 4) {
				count = atol(argv[3]);
			}
			file_create(argv[1], count);
			break;
		case 4:
			printf("[+] Directory creation test\n");
			if (argc == 4) {
				count = atol(argv[3]);
			}
			dir_create(argv[1], count);
			break;
		default:
			printf("Type not recognised\n");
			exit(1);
        }
        return 0;
}

fsmsg.c
-------
#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>
#include <getopt.h>
#include <utmp.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <limits.h>
#include <inttypes.h>

#include <netlink/genl/genl.h>
#include <netlink/genl/ctrl.h>

char *progname;

/*
 * Definition for fs netlink interface
 */
#define FS_NL_NOWARN		0
#define FS_NL_ENOSPC_WARN	1
#define FS_NL_META_ENOSPC_WARN	2

enum {
       FS_NL_A_UNSPEC,
       FS_NL_A_WARNING,
       FS_NL_A_DEV_MAJOR,
       FS_NL_A_DEV_MINOR,
       FS_NL_A_CAUSED_ID,
       __FS_NL_A_MAX,
};
#define FS_NL_A_MAX (__FS_NL_A_MAX - 1)

enum {
	FS_NL_C_UNSPEC,
	FS_NL_C_WARNING,
	__FS_NL_C_MAX,
};
#define FS_NL_C_MAX (__FS_NL_C_MAX - 1)

static const struct option options[] = {
	{ "help", 0, NULL, 'h' },
	{ "no-daemon", 0, NULL, 'F' },
	{ NULL, 0, NULL, 0 }
};

static struct nla_policy fs_nl_warn_cmd_policy[FS_NL_A_MAX+1] = {
	[FS_NL_A_WARNING] = { .type = NLA_U32 },
	[FS_NL_A_DEV_MAJOR] = { .type = NLA_U32 },
	[FS_NL_A_DEV_MINOR] = { .type = NLA_U32 },
	[FS_NL_A_CAUSED_ID] = { .type = NLA_U64 },
};

/* User options */
#define FL_NODAEMON 1

int flags;

void show_help(void)
{
	printf("Usage: %s [options]\nOptions are:\n\
 -h --help         shows this text\n\
 -F --foreground   run daemon in foreground\n", progname);
}

void die(int err, const char *string)
{
	fprintf(stderr, "fsmsg: %s\n", string);
	exit(err);
}

static void parse_options(int argc, char **argv)
{
	int opt;

	while ((opt = getopt_long(argc, argv, "VhDCFb", options, NULL)) >= 0) {
		switch (opt) {
			case 'h':
				show_help();
				exit(0);
			case 'F':
				flags |= FL_NODAEMON;
				break;
			default:
				printf("Unknown option '%c'.\n", opt);
				show_help();
				exit(1);
		}
	}
}

static int fs_nl_parser(struct nl_msg *msg, void *arg)
{
	struct nlmsghdr *nlh = nlmsg_hdr(msg);
	struct genlmsghdr *ghdr;
	struct nlattr *attrs[FS_NL_A_MAX+1];
	int ret, warntype;
	char *warn_msg;

	if (!genlmsg_valid_hdr(nlh, 0))
                return 0;
        ghdr = nlmsg_data(nlh);
	if (ghdr->cmd != FS_NL_C_WARNING)
		return 0;

	ret = genlmsg_parse(nlh, 0, attrs, FS_NL_A_MAX, fs_nl_warn_cmd_policy);
	if (ret < 0) {
		printf("Error parsing netlink message.\n");
		return ret;
	}
	if (!attrs[FS_NL_A_WARNING] ||
	    !attrs[FS_NL_A_DEV_MAJOR] || !attrs[FS_NL_A_DEV_MAJOR] ||
	    !attrs[FS_NL_A_DEV_MINOR] || !attrs[FS_NL_A_CAUSED_ID]) {
		printf("Unknown format of kernel netlink message!\n"
		       "Maybe your fsmsg is too old?\n");
		return -EINVAL;
	}
	warntype = nla_get_u32(attrs[FS_NL_A_WARNING]);

	switch (warntype) {
		case FS_NL_ENOSPC_WARN:
			warn_msg = "no space left on the file system";
			break;
		case FS_NL_META_ENOSPC_WARN:
			warn_msg = "not enough space left for metadata";
			break;
		default:
			warn_msg = "unknown file system warning";
	}

	printf("VFS: on device (%u:%u) from UID=%" PRIu64 " %s.\n",
				     nla_get_u32(attrs[FS_NL_A_DEV_MAJOR]),
				     nla_get_u32(attrs[FS_NL_A_DEV_MINOR]),
				     nla_get_u64(attrs[FS_NL_A_CAUSED_ID]),
				     warn_msg);

	return 0;
}

static struct nl_handle *init_netlink(void)
{
	struct nl_handle *handle;
	int ret, family;

	handle = nl_handle_alloc();
	if (!handle)
		die(2, "Cannot allocate netlink handle!\n");
	nl_disable_sequence_check(handle);
	ret = genl_connect(handle);
	if (ret < 0)
		die(2, "Cannot connect to netlink socket\n");
	family = genl_ctrl_resolve(handle, "FS_MSG");
	if (ret < 0)
		die(2, "Cannot resolve fs netlink name\n");

	ret = nl_socket_add_membership(handle, family);
	if (ret < 0)
		die(2, "Cannot join fs multicast group\n");

	ret = nl_socket_modify_cb(handle, NL_CB_VALID, NL_CB_CUSTOM,
			fs_nl_parser, NULL);
	if (ret < 0)
		die(2, "Cannot register callback for"
			 " netlink messages\n");

	return handle;
}

static void run(struct nl_handle *nhandle)
{
	int ret;

	while (1) {
		ret = nl_recvmsgs_default(nhandle);
		if (ret < 0)
			printf("Failed to read or parse fs netlink"
				" message: %s\n", strerror(-ret));
	}
}

int main(int argc, char **argv)
{
	struct nl_handle *nhandle;

	progname = basename(argv[0]);
	parse_options(argc, argv);

	nhandle = init_netlink();
	if (!(flags & FL_NODAEMON)) {
		daemon(0, 0);
	}
	run(nhandle);
	return 0;
}

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH 1/5] fs: add netlink notification interface
  2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
@ 2011-08-18 12:18 ` Lukas Czerner
  2011-08-18 15:00   ` Randy Dunlap
  2011-08-19 15:15   ` Ted Ts'o
  2011-08-18 12:18 ` [PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions Lukas Czerner
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Lukas Czerner, Alexander Viro

There might be a lot of crazy things happening inside the file systems
but it might result in bogus error code returned to the user space. And
sometimes it is hard to figure out what just happened. This
commit adds the interface which can be used by file systems to send
better information to the user space via netlink interface, because it
is not bound with error codes.

Also applications might not report problems with the file systems
correctly, hence the administrator will never know about the problem
unless it is already too late. Also in the case of ENOSPC conditions
even if we are checking 'df' output from cronjob, we might miss some
ENOSPC states because it just takes snapshots of the state. Those are
just examples.

With this interface file system can send a message via netlink interface
at the moment when the problem arises.

In order to use this file system must register the netlink
interface with init_fs_nl_family() on module initialization and the in
can send messages with fs_nl_send_warning().

At this point there are only two types of warning FS_NL_ENOSPC_WARN,
which should be used in situations when file system does not have enough
space to reserve data blocks and FS_NL_META_ENOSPC_WARN, for situations
when file system does not have enough space to reserve metadata blocks.
But more can be added in the future.

The code has been based on fs/quota/netlink.c which is used to send
quota warnings to the user space. Eventually it can be merged into this
interface.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
---
 fs/Makefile        |    2 +-
 fs/netlink.c       |  107 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/fs.h |   11 +++++
 3 files changed, 119 insertions(+), 1 deletions(-)
 create mode 100644 fs/netlink.c

diff --git a/fs/Makefile b/fs/Makefile
index afc1096..7e9c61f 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -11,7 +11,7 @@ obj-y :=	open.o read_write.o file_table.o super.o \
 		attr.o bad_inode.o file.o filesystems.o namespace.o \
 		seq_file.o xattr.o libfs.o fs-writeback.o \
 		pnode.o drop_caches.o splice.o sync.o utimes.o \
-		stack.o fs_struct.o statfs.o
+		stack.o fs_struct.o statfs.o netlink.o
 
 ifeq ($(CONFIG_BLOCK),y)
 obj-y +=	buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
diff --git a/fs/netlink.c b/fs/netlink.c
new file mode 100644
index 0000000..15c44a1
--- /dev/null
+++ b/fs/netlink.c
@@ -0,0 +1,107 @@
+#include <linux/fs.h>
+#include <linux/cred.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <net/netlink.h>
+#include <net/genetlink.h>
+
+enum {
+	FS_NL_A_UNSPEC,
+	FS_NL_A_WARNING,
+	FS_NL_A_DEV_MAJOR,
+	FS_NL_A_DEV_MINOR,
+	FS_NL_A_CAUSED_ID,
+	__FS_NL_A_MAX,
+};
+#define FS_NL_A_MAX (__FS_NL_A_MAX - 1)
+
+enum {
+	FS_NL_C_UNSPEC,
+	FS_NL_C_WARNING,
+	__FS_NL_C_MAX,
+};
+#define FS_NL_C_MAX (__FS_NL_C_MAX - 1)
+
+
+static struct genl_family fs_genl_family = {
+	.id = GENL_ID_GENERATE,
+	.hdrsize = 0,
+	.name = "FS_MSG",
+	.version = 1,
+	.maxattr = FS_NL_A_MAX,
+};
+static int registered;
+
+/**
+ * fs_nl_send_warning - Send warning from file system to userspace
+ * @dev: The device on which the fs is mounted
+ * @warntype: The type of the warning
+ *
+ * This can be used by file systems to send a warning message to the
+ * userspace.
+ */
+
+void fs_nl_send_warning(dev_t dev, unsigned int warntype)
+{
+	static atomic_t seq;
+	struct sk_buff *skb;
+	void *msg_head;
+	int ret;
+	int msg_size = nla_total_size(sizeof(u64)) +
+			3 * nla_total_size(sizeof(u32));
+
+	/* We have to allocate using GFP_NOFS as we are called from a
+	 * filesystem performing write and thus further recursion into
+	 * the fs to free some data could cause deadlocks. */
+	skb = genlmsg_new(msg_size, GFP_NOFS);
+	if (!skb) {
+		printk(KERN_ERR
+		  "VFS: Not enough memory to send fs warning.\n");
+		return;
+	}
+	msg_head = genlmsg_put(skb, 0, atomic_add_return(1, &seq),
+			&fs_genl_family, 0, FS_NL_C_WARNING);
+	if (!msg_head) {
+		printk(KERN_ERR
+		  "VFS: Cannot store netlink header in fs warning.\n");
+		goto err_out;
+	}
+	ret = nla_put_u32(skb, FS_NL_A_WARNING, warntype);
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u32(skb, FS_NL_A_DEV_MAJOR, MAJOR(dev));
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u32(skb, FS_NL_A_DEV_MINOR, MINOR(dev));
+	if (ret)
+		goto attr_err_out;
+	ret = nla_put_u64(skb, FS_NL_A_CAUSED_ID, current_uid());
+	if (ret)
+		goto attr_err_out;
+	genlmsg_end(skb, msg_head);
+	genlmsg_multicast(skb, 0, fs_genl_family.id, GFP_NOFS);
+	return;
+attr_err_out:
+	printk(KERN_ERR "VFS: Not enough space to compose "
+			"fs netlink message!\n");
+err_out:
+	kfree_skb(skb);
+}
+EXPORT_SYMBOL(fs_nl_send_warning);
+
+void init_fs_nl_family(void)
+{
+	if (registered)
+		return;
+
+	if (genl_register_family(&fs_genl_family) != 0) {
+		printk(KERN_ERR
+		       "VFS: Failed to create fs netlink interface.\n");
+		return;
+	}
+	registered = 1;
+}
+EXPORT_SYMBOL(init_fs_nl_family);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 178cdb4..78ba846 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2663,5 +2663,16 @@ static inline void inode_has_no_xattr(struct inode *inode)
 		inode->i_flags |= S_NOSEC;
 }
 
+/*
+ * Definition for fs netlink interface
+ */
+#define FS_NL_NOWARN		0
+#define FS_NL_ENOSPC_WARN	1
+#define FS_NL_META_ENOSPC_WARN	2
+
+/* fs/netlink.c */
+void init_fs_nl_family(void);
+void fs_nl_send_warning(dev_t dev, unsigned int warntype);
+
 #endif /* __KERNEL__ */
 #endif /* _LINUX_FS_H */
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
  2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
@ 2011-08-18 12:18 ` Lukas Czerner
  2011-08-18 12:18 ` [PATCH 3/5] ext4: " Lukas Czerner
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Lukas Czerner, Jan Kara, Ext4 Developers List

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Also fix ext3_should_retry_alloc() so we do not check for free blocks
when we are actually allocating metadata

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
CC: Jan Kara <jack@suse.cz>
CC: Ext4 Developers List <linux-ext4@vger.kernel.org>
---
 fs/ext3/acl.c           |    5 +++--
 fs/ext3/balloc.c        |   10 ++++++++--
 fs/ext3/inode.c         |    4 ++--
 fs/ext3/namei.c         |   10 +++++-----
 fs/ext3/super.c         |    1 +
 fs/ext3/xattr.c         |    2 +-
 include/linux/ext3_fs.h |    3 ++-
 7 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/fs/ext3/acl.c b/fs/ext3/acl.c
index 3091f62..b900123 100644
--- a/fs/ext3/acl.c
+++ b/fs/ext3/acl.c
@@ -322,7 +322,7 @@ retry:
 	error = ext3_set_acl(handle, inode, ACL_TYPE_ACCESS, acl);
 	ext3_journal_stop(handle);
 	if (error == -ENOSPC &&
-	    ext3_should_retry_alloc(inode->i_sb, &retries))
+	    ext3_should_retry_alloc(inode->i_sb, &retries, 0))
 		goto retry;
 out:
 	posix_acl_release(acl);
@@ -415,7 +415,8 @@ retry:
 		return PTR_ERR(handle);
 	error = ext3_set_acl(handle, inode, type, acl);
 	ext3_journal_stop(handle);
-	if (error == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
+	if (error == -ENOSPC && ext3_should_retry_alloc(inode->i_sb,
+							&retries, 0))
 		goto retry;
 
 release_and_out:
diff --git a/fs/ext3/balloc.c b/fs/ext3/balloc.c
index 6386d76..5f851ff 100644
--- a/fs/ext3/balloc.c
+++ b/fs/ext3/balloc.c
@@ -1458,6 +1458,7 @@ static int ext3_has_free_blocks(struct ext3_sb_info *sbi)
  * ext3_should_retry_alloc()
  * @sb:			super block
  * @retries		number of attemps has been made
+ * @data		allocating (data=1 or metadata=0)
  *
  * ext3_should_retry_alloc() is called when ENOSPC is returned, and if
  * it is profitable to retry the operation, this function will wait
@@ -1466,10 +1467,15 @@ static int ext3_has_free_blocks(struct ext3_sb_info *sbi)
  *
  * if the total number of retries exceed three times, return FALSE.
  */
-int ext3_should_retry_alloc(struct super_block *sb, int *retries)
+int ext3_should_retry_alloc(struct super_block *sb, int *retries, int data)
 {
-	if (!ext3_has_free_blocks(EXT3_SB(sb)) || (*retries)++ > 3)
+	if ((data && !ext3_has_free_blocks(EXT3_SB(sb))) || ++(*retries) >= 3) {
+		if (data)
+			fs_nl_send_warning(sb->s_dev, FS_NL_ENOSPC_WARN);
+		else
+			fs_nl_send_warning(sb->s_dev, FS_NL_META_ENOSPC_WARN);
 		return 0;
+	}
 
 	jbd_debug(1, "%s: retrying operation after ENOSPC\n", sb->s_id);
 
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 04da6ac..6c5f69d 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1305,7 +1305,7 @@ write_begin_failed:
 		if (pos + len > inode->i_size)
 			ext3_truncate_failed_write(inode);
 	}
-	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry;
 out:
 	return ret;
@@ -1889,7 +1889,7 @@ retry:
 		if (end > isize)
 			ext3_truncate_failed_direct_write(inode);
 	}
-	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext3_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry;
 
 	if (orphan) {
diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index 5571708..9abaebf 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -1725,7 +1725,7 @@ retry:
 		err = ext3_add_nondir(handle, dentry, inode);
 	}
 	ext3_journal_stop(handle);
-	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -1762,7 +1762,7 @@ retry:
 		err = ext3_add_nondir(handle, dentry, inode);
 	}
 	ext3_journal_stop(handle);
-	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -1849,7 +1849,7 @@ out_clear_inode:
 out_stop:
 	brelse(dir_block);
 	ext3_journal_stop(handle);
-	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -2287,7 +2287,7 @@ retry:
 	err = ext3_add_nondir(handle, dentry, inode);
 out_stop:
 	ext3_journal_stop(handle);
-	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 err_drop_inode:
@@ -2330,7 +2330,7 @@ retry:
 		iput(inode);
 	}
 	ext3_journal_stop(handle);
-	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext3_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
diff --git a/fs/ext3/super.c b/fs/ext3/super.c
index 7beb69a..d4d6124 100644
--- a/fs/ext3/super.c
+++ b/fs/ext3/super.c
@@ -3077,6 +3077,7 @@ static int __init init_ext3_fs(void)
         err = register_filesystem(&ext3_fs_type);
 	if (err)
 		goto out;
+	init_fs_nl_family();
 	return 0;
 out:
 	destroy_inodecache();
diff --git a/fs/ext3/xattr.c b/fs/ext3/xattr.c
index d565759..431db28 100644
--- a/fs/ext3/xattr.c
+++ b/fs/ext3/xattr.c
@@ -1072,7 +1072,7 @@ retry:
 					      value, value_len, flags);
 		error2 = ext3_journal_stop(handle);
 		if (error == -ENOSPC &&
-		    ext3_should_retry_alloc(inode->i_sb, &retries))
+		    ext3_should_retry_alloc(inode->i_sb, &retries, 0))
 			goto retry;
 		if (error == 0)
 			error = error2;
diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
index 67a803a..34daec8 100644
--- a/include/linux/ext3_fs.h
+++ b/include/linux/ext3_fs.h
@@ -861,7 +861,8 @@ extern void ext3_check_blocks_bitmap (struct super_block *);
 extern struct ext3_group_desc * ext3_get_group_desc(struct super_block * sb,
 						    unsigned int block_group,
 						    struct buffer_head ** bh);
-extern int ext3_should_retry_alloc(struct super_block *sb, int *retries);
+extern int ext3_should_retry_alloc(struct super_block *sb, int *retries,
+				   int data);
 extern void ext3_init_block_alloc_info(struct inode *);
 extern void ext3_rsv_window_add(struct super_block *sb, struct ext3_reserve_window_node *rsv);
 extern int ext3_trim_fs(struct super_block *sb, struct fstrim_range *range);
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 3/5] ext4: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
  2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
  2011-08-18 12:18 ` [PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions Lukas Czerner
@ 2011-08-18 12:18 ` Lukas Czerner
  2011-08-18 12:18   ` Lukas Czerner
  2011-08-18 12:18 ` [PATCH 5/5] btrfs: " Lukas Czerner
  4 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-fsdevel, Lukas Czerner, Theodore Ts'o, Ext4 Developers List

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Also fix ext4_should_retry_alloc() so we do not check for free blocks
when we are actually allocating metadata.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
CC: Theodore Ts'o <tytso@mit.edu>
CC: Ext4 Developers List <linux-ext4@vger.kernel.org>
---
 fs/ext4/acl.c      |    5 +++--
 fs/ext4/balloc.c   |   15 +++++++++++----
 fs/ext4/ext4.h     |    3 ++-
 fs/ext4/extents.c  |    2 +-
 fs/ext4/indirect.c |    2 +-
 fs/ext4/inode.c    |   10 +++++-----
 fs/ext4/namei.c    |   10 +++++-----
 fs/ext4/super.c    |    1 +
 fs/ext4/xattr.c    |    2 +-
 9 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/fs/ext4/acl.c b/fs/ext4/acl.c
index a5c29bb..cc3e72c 100644
--- a/fs/ext4/acl.c
+++ b/fs/ext4/acl.c
@@ -321,7 +321,7 @@ retry:
 	error = ext4_set_acl(handle, inode, ACL_TYPE_ACCESS, acl);
 	ext4_journal_stop(handle);
 	if (error == -ENOSPC &&
-	    ext4_should_retry_alloc(inode->i_sb, &retries))
+	    ext4_should_retry_alloc(inode->i_sb, &retries, 0))
 		goto retry;
 out:
 	posix_acl_release(acl);
@@ -414,7 +414,8 @@ retry:
 		return PTR_ERR(handle);
 	error = ext4_set_acl(handle, inode, type, acl);
 	ext4_journal_stop(handle);
-	if (error == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (error == -ENOSPC && ext4_should_retry_alloc(inode->i_sb,
+							&retries, 0))
 		goto retry;
 
 release_and_out:
diff --git a/fs/ext4/balloc.c b/fs/ext4/balloc.c
index f8224ad..42daae7 100644
--- a/fs/ext4/balloc.c
+++ b/fs/ext4/balloc.c
@@ -426,12 +426,19 @@ int ext4_claim_free_blocks(struct ext4_sb_info *sbi,
  *
  * if the total number of retries exceed three times, return FALSE.
  */
-int ext4_should_retry_alloc(struct super_block *sb, int *retries)
+int ext4_should_retry_alloc(struct super_block *sb, int *retries, int data)
 {
-	if (!ext4_has_free_blocks(EXT4_SB(sb), 1, 0) ||
-	    (*retries)++ > 3 ||
-	    !EXT4_SB(sb)->s_journal)
+	if ((data && !ext4_has_free_blocks(EXT4_SB(sb), 1, 0)) ||
+	    ++(*retries) >= 3 ||
+	    !EXT4_SB(sb)->s_journal) {
+		if (data)
+			fs_nl_send_warning(sb->s_dev, FS_NL_ENOSPC_WARN);
+		else
+			fs_nl_send_warning(sb->s_dev, FS_NL_META_ENOSPC_WARN);
 		return 0;
+	}
+
+
 
 	jbd_debug(1, "%s: retrying operation after ENOSPC\n", sb->s_id);
 
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index e717dfd..45cb981 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1741,7 +1741,8 @@ extern void ext4_check_blocks_bitmap(struct super_block *);
 extern struct ext4_group_desc * ext4_get_group_desc(struct super_block * sb,
 						    ext4_group_t block_group,
 						    struct buffer_head ** bh);
-extern int ext4_should_retry_alloc(struct super_block *sb, int *retries);
+extern int ext4_should_retry_alloc(struct super_block *sb, int *retries,
+				   int data);
 struct buffer_head *ext4_read_block_bitmap(struct super_block *sb,
 				      ext4_group_t block_group);
 extern unsigned ext4_init_block_bitmap(struct super_block *sb,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 57cf568..1c8cb04 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3834,7 +3834,7 @@ retry:
 			break;
 	}
 	if (ret == -ENOSPC &&
-			ext4_should_retry_alloc(inode->i_sb, &retries)) {
+			ext4_should_retry_alloc(inode->i_sb, &retries, 1)) {
 		ret = 0;
 		goto retry;
 	}
diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index b8602cd..40153bc 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -817,7 +817,7 @@ retry:
 				ext4_truncate_failed_write(inode);
 		}
 	}
-	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry;
 
 	if (orphan) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d47264c..d0596f3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -814,7 +814,7 @@ retry:
 		}
 	}
 
-	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry;
 out:
 	return ret;
@@ -1067,7 +1067,7 @@ repeat:
 	 */
 	if (ext4_claim_free_blocks(sbi, md_needed + 1, 0)) {
 		dquot_release_reservation_block(inode, 1);
-		if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
+		if (ext4_should_retry_alloc(inode->i_sb, &retries, 1)) {
 			yield();
 			goto repeat;
 		}
@@ -2291,7 +2291,7 @@ retry:
 			ext4_truncate_failed_write(inode);
 	}
 
-	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry;
 out:
 	return ret;
@@ -4349,7 +4349,7 @@ int ext4_page_mkwrite(struct vm_area_struct *vma, struct vm_fault *vmf)
 			ret = __block_page_mkwrite(vma, vmf,
 						   ext4_da_get_block_prep);
 		} while (ret == -ENOSPC &&
-		       ext4_should_retry_alloc(inode->i_sb, &retries));
+		       ext4_should_retry_alloc(inode->i_sb, &retries, 1));
 		goto out_ret;
 	}
 
@@ -4402,7 +4402,7 @@ retry_alloc:
 		ext4_set_inode_state(inode, EXT4_STATE_JDATA);
 	}
 	ext4_journal_stop(handle);
-	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries, 1))
 		goto retry_alloc;
 out_ret:
 	ret = block_page_mkwrite_return(ret);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index f8068c7..5a2fe5e 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1764,7 +1764,7 @@ retry:
 		err = ext4_add_nondir(handle, dentry, inode);
 	}
 	ext4_journal_stop(handle);
-	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -1801,7 +1801,7 @@ retry:
 		err = ext4_add_nondir(handle, dentry, inode);
 	}
 	ext4_journal_stop(handle);
-	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -1886,7 +1886,7 @@ out_clear_inode:
 out_stop:
 	brelse(dir_block);
 	ext4_journal_stop(handle);
-	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
@@ -2333,7 +2333,7 @@ retry:
 	err = ext4_add_nondir(handle, dentry, inode);
 out_stop:
 	ext4_journal_stop(handle);
-	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 err_drop_inode:
@@ -2376,7 +2376,7 @@ retry:
 		iput(inode);
 	}
 	ext4_journal_stop(handle);
-	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries))
+	if (err == -ENOSPC && ext4_should_retry_alloc(dir->i_sb, &retries, 0))
 		goto retry;
 	return err;
 }
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 4687fea..83b7ccd 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5011,6 +5011,7 @@ static int __init ext4_init_fs(void)
 
 	ext4_li_info = NULL;
 	mutex_init(&ext4_li_mtx);
+	init_fs_nl_family();
 	return 0;
 out:
 	unregister_as_ext2();
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index c757adc..a3e381b 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -1094,7 +1094,7 @@ retry:
 					      value, value_len, flags);
 		error2 = ext4_journal_stop(handle);
 		if (error == -ENOSPC &&
-		    ext4_should_retry_alloc(inode->i_sb, &retries))
+		    ext4_should_retry_alloc(inode->i_sb, &retries, 0))
 			goto retry;
 		if (error == 0)
 			error = error2;
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
@ 2011-08-18 12:18   ` Lukas Czerner
  2011-08-18 12:18 ` [PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions Lukas Czerner
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Lukas Czerner, Christoph Hellwig, xfs

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: xfs@oss.sgi.com
---
 fs/xfs/linux-2.6/xfs_file.c  |    2 ++
 fs/xfs/linux-2.6/xfs_super.c |    1 +
 fs/xfs/xfs_vnodeops.c        |   10 ++++++++--
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
index 7f7b424..ab41603 100644
--- a/fs/xfs/linux-2.6/xfs_file.c
+++ b/fs/xfs/linux-2.6/xfs_file.c
@@ -896,6 +896,8 @@ xfs_file_aio_write(
 out_unlock:
 	xfs_aio_write_newsize_update(ip);
 	xfs_rw_iunlock(ip, iolock);
+	if (unlikely(-ENOSPC == ret))
+		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);
 	return ret;
 }
 
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 9a72dda..dd167a0 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1731,6 +1731,7 @@ init_xfs_fs(void)
 	error = register_filesystem(&xfs_fs_type);
 	if (error)
 		goto out_sysctl_unregister;
+	init_fs_nl_family();
 	return 0;
 
  out_sysctl_unregister:
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 51fc429..1a95115 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -996,6 +996,8 @@ xfs_create(
  out_trans_abort:
 	cancel_flags |= XFS_TRANS_ABORT;
  out_trans_cancel:
+	if (ENOSPC == error)
+		fs_nl_send_warning(mp->m_super->s_dev, FS_NL_META_ENOSPC_WARN);
 	xfs_trans_cancel(tp, cancel_flags);
  out_release_inode:
 	/*
@@ -2302,7 +2304,7 @@ xfs_change_file_space(
 		error = xfs_alloc_file_space(ip, startoffset, bf->l_len,
 						prealloc_type, attr_flags);
 		if (error)
-			return error;
+			goto alloc_failed;
 		setprealloc = 1;
 		break;
 
@@ -2321,7 +2323,7 @@ xfs_change_file_space(
 			error = xfs_alloc_file_space(ip, fsize,
 					startoffset - fsize, 0, attr_flags);
 			if (error)
-				break;
+				goto alloc_failed;
 		}
 
 		iattr.ia_valid = ATTR_SIZE;
@@ -2384,5 +2386,9 @@ xfs_change_file_space(
 
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 
+alloc_failed:
+	if (unlikely(ENOSPC == error))
+		fs_nl_send_warning(mp->m_super->s_dev, FS_NL_ENOSPC_WARN);
+
 	return error;
 }
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
@ 2011-08-18 12:18   ` Lukas Czerner
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Lukas Czerner, Christoph Hellwig, xfs

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
CC: Christoph Hellwig <hch@lst.de>
CC: xfs@oss.sgi.com
---
 fs/xfs/linux-2.6/xfs_file.c  |    2 ++
 fs/xfs/linux-2.6/xfs_super.c |    1 +
 fs/xfs/xfs_vnodeops.c        |   10 ++++++++--
 3 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_file.c b/fs/xfs/linux-2.6/xfs_file.c
index 7f7b424..ab41603 100644
--- a/fs/xfs/linux-2.6/xfs_file.c
+++ b/fs/xfs/linux-2.6/xfs_file.c
@@ -896,6 +896,8 @@ xfs_file_aio_write(
 out_unlock:
 	xfs_aio_write_newsize_update(ip);
 	xfs_rw_iunlock(ip, iolock);
+	if (unlikely(-ENOSPC == ret))
+		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);
 	return ret;
 }
 
diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 9a72dda..dd167a0 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -1731,6 +1731,7 @@ init_xfs_fs(void)
 	error = register_filesystem(&xfs_fs_type);
 	if (error)
 		goto out_sysctl_unregister;
+	init_fs_nl_family();
 	return 0;
 
  out_sysctl_unregister:
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index 51fc429..1a95115 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -996,6 +996,8 @@ xfs_create(
  out_trans_abort:
 	cancel_flags |= XFS_TRANS_ABORT;
  out_trans_cancel:
+	if (ENOSPC == error)
+		fs_nl_send_warning(mp->m_super->s_dev, FS_NL_META_ENOSPC_WARN);
 	xfs_trans_cancel(tp, cancel_flags);
  out_release_inode:
 	/*
@@ -2302,7 +2304,7 @@ xfs_change_file_space(
 		error = xfs_alloc_file_space(ip, startoffset, bf->l_len,
 						prealloc_type, attr_flags);
 		if (error)
-			return error;
+			goto alloc_failed;
 		setprealloc = 1;
 		break;
 
@@ -2321,7 +2323,7 @@ xfs_change_file_space(
 			error = xfs_alloc_file_space(ip, fsize,
 					startoffset - fsize, 0, attr_flags);
 			if (error)
-				break;
+				goto alloc_failed;
 		}
 
 		iattr.ia_valid = ATTR_SIZE;
@@ -2384,5 +2386,9 @@ xfs_change_file_space(
 
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 
+alloc_failed:
+	if (unlikely(ENOSPC == error))
+		fs_nl_send_warning(mp->m_super->s_dev, FS_NL_ENOSPC_WARN);
+
 	return error;
 }
-- 
1.7.4.4

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
                   ` (3 preceding siblings ...)
  2011-08-18 12:18   ` Lukas Czerner
@ 2011-08-18 12:18 ` Lukas Czerner
  2011-08-18 18:22   ` David Sterba
  4 siblings, 1 reply; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 12:18 UTC (permalink / raw)
  To: linux-kernel; +Cc: linux-fsdevel, Lukas Czerner, Chris Mason, linux-btrfs

Register fs netlink interface and send proper warning if ENOSPC is
encountered. Note that we differentiate between enospc for metadata and
enospc for data.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Chris Mason <chris.mason@oracle.com>
Cc: linux-btrfs@vger.kernel.org
---
 fs/btrfs/extent-tree.c |   13 ++++++++++---
 fs/btrfs/super.c       |    1 +
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 66bac22..a47d9b9 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
 {
 	struct btrfs_space_info *data_sinfo;
 	struct btrfs_root *root = BTRFS_I(inode)->root;
+	struct btrfs_fs_info *fs_info = root->fs_info;
 	u64 used;
 	int ret = 0, committed = 0, alloc_chunk = 1;
 
 	/* make sure bytes are sectorsize aligned */
 	bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
 
-	if (root == root->fs_info->tree_root ||
+	if (root == fs_info->tree_root ||
 	    BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
 		alloc_chunk = 0;
 		committed = 1;
@@ -3070,7 +3071,7 @@ alloc:
 			if (IS_ERR(trans))
 				return PTR_ERR(trans);
 
-			ret = do_chunk_alloc(trans, root->fs_info->extent_root,
+			ret = do_chunk_alloc(trans, fs_info->extent_root,
 					     bytes + 2 * 1024 * 1024,
 					     alloc_target,
 					     CHUNK_ALLOC_NO_FORCE);
@@ -3100,7 +3101,7 @@ alloc:
 		/* commit the current transaction and try again */
 commit_trans:
 		if (!committed &&
-		    !atomic_read(&root->fs_info->open_ioctl_trans)) {
+		    !atomic_read(&fs_info->open_ioctl_trans)) {
 			committed = 1;
 			trans = btrfs_join_transaction(root);
 			if (IS_ERR(trans))
@@ -3111,6 +3112,8 @@ commit_trans:
 			goto again;
 		}
 
+		fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
+				   FS_NL_ENOSPC_WARN);
 		return -ENOSPC;
 	}
 	data_sinfo->bytes_may_use += bytes;
@@ -3522,6 +3525,10 @@ again:
 	}
 
 out:
+	if (unlikely(-ENOSPC == ret)) {
+		dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
+		fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
+	}
 	if (flushing) {
 		spin_lock(&space_info->lock);
 		space_info->flush = 0;
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 15634d4..8ac9e01 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
 	if (err)
 		goto unregister_ioctl;
 
+	init_fs_nl_family();
 	printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
 	return 0;
 
-- 
1.7.4.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] fs: add netlink notification interface
  2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
@ 2011-08-18 15:00   ` Randy Dunlap
  2011-08-18 15:42     ` Lukas Czerner
  2011-08-19 15:15   ` Ted Ts'o
  1 sibling, 1 reply; 17+ messages in thread
From: Randy Dunlap @ 2011-08-18 15:00 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-kernel, linux-fsdevel, Alexander Viro

On Thu, 18 Aug 2011 14:18:22 +0200 Lukas Czerner wrote:


>  fs/Makefile        |    2 +-
>  fs/netlink.c       |  107 ++++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/fs.h |   11 +++++
>  3 files changed, 119 insertions(+), 1 deletions(-)
>  create mode 100644 fs/netlink.c

Hi,

Does this build OK when CONFIG_NET is not enabled?

and also for kernels that do not have CONFIG_NET enabled:  are these
new netlink messages basically duplicates of printk() calls that are
already there and will remain there?

Thanks.

> diff --git a/fs/Makefile b/fs/Makefile
> index afc1096..7e9c61f 100644
> --- a/fs/Makefile
> +++ b/fs/Makefile
> @@ -11,7 +11,7 @@ obj-y :=	open.o read_write.o file_table.o super.o \
>  		attr.o bad_inode.o file.o filesystems.o namespace.o \
>  		seq_file.o xattr.o libfs.o fs-writeback.o \
>  		pnode.o drop_caches.o splice.o sync.o utimes.o \
> -		stack.o fs_struct.o statfs.o
> +		stack.o fs_struct.o statfs.o netlink.o
>  
>  ifeq ($(CONFIG_BLOCK),y)
>  obj-y +=	buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
> diff --git a/fs/netlink.c b/fs/netlink.c
> new file mode 100644
> index 0000000..15c44a1
> --- /dev/null
> +++ b/fs/netlink.c
> @@ -0,0 +1,107 @@
> +#include <linux/fs.h>
> +#include <linux/cred.h>
> +#include <linux/init.h>
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +#include <linux/sched.h>
> +#include <linux/slab.h>
> +#include <net/netlink.h>
> +#include <net/genetlink.h>
> +
> +enum {
> +	FS_NL_A_UNSPEC,
> +	FS_NL_A_WARNING,
> +	FS_NL_A_DEV_MAJOR,
> +	FS_NL_A_DEV_MINOR,
> +	FS_NL_A_CAUSED_ID,
> +	__FS_NL_A_MAX,
> +};
> +#define FS_NL_A_MAX (__FS_NL_A_MAX - 1)
> +
> +enum {
> +	FS_NL_C_UNSPEC,
> +	FS_NL_C_WARNING,
> +	__FS_NL_C_MAX,
> +};
> +#define FS_NL_C_MAX (__FS_NL_C_MAX - 1)
> +
> +
> +static struct genl_family fs_genl_family = {
> +	.id = GENL_ID_GENERATE,
> +	.hdrsize = 0,
> +	.name = "FS_MSG",
> +	.version = 1,
> +	.maxattr = FS_NL_A_MAX,
> +};
> +static int registered;
> +
> +/**
> + * fs_nl_send_warning - Send warning from file system to userspace
> + * @dev: The device on which the fs is mounted
> + * @warntype: The type of the warning
> + *
> + * This can be used by file systems to send a warning message to the
> + * userspace.
> + */
> +
> +void fs_nl_send_warning(dev_t dev, unsigned int warntype)
> +{
> +	static atomic_t seq;
> +	struct sk_buff *skb;
> +	void *msg_head;
> +	int ret;
> +	int msg_size = nla_total_size(sizeof(u64)) +
> +			3 * nla_total_size(sizeof(u32));
> +
> +	/* We have to allocate using GFP_NOFS as we are called from a
> +	 * filesystem performing write and thus further recursion into
> +	 * the fs to free some data could cause deadlocks. */
> +	skb = genlmsg_new(msg_size, GFP_NOFS);
> +	if (!skb) {
> +		printk(KERN_ERR
> +		  "VFS: Not enough memory to send fs warning.\n");
> +		return;
> +	}
> +	msg_head = genlmsg_put(skb, 0, atomic_add_return(1, &seq),
> +			&fs_genl_family, 0, FS_NL_C_WARNING);
> +	if (!msg_head) {
> +		printk(KERN_ERR
> +		  "VFS: Cannot store netlink header in fs warning.\n");
> +		goto err_out;
> +	}
> +	ret = nla_put_u32(skb, FS_NL_A_WARNING, warntype);
> +	if (ret)
> +		goto attr_err_out;
> +	ret = nla_put_u32(skb, FS_NL_A_DEV_MAJOR, MAJOR(dev));
> +	if (ret)
> +		goto attr_err_out;
> +	ret = nla_put_u32(skb, FS_NL_A_DEV_MINOR, MINOR(dev));
> +	if (ret)
> +		goto attr_err_out;
> +	ret = nla_put_u64(skb, FS_NL_A_CAUSED_ID, current_uid());
> +	if (ret)
> +		goto attr_err_out;
> +	genlmsg_end(skb, msg_head);
> +	genlmsg_multicast(skb, 0, fs_genl_family.id, GFP_NOFS);
> +	return;
> +attr_err_out:
> +	printk(KERN_ERR "VFS: Not enough space to compose "
> +			"fs netlink message!\n");
> +err_out:
> +	kfree_skb(skb);
> +}
> +EXPORT_SYMBOL(fs_nl_send_warning);
> +
> +void init_fs_nl_family(void)
> +{
> +	if (registered)
> +		return;
> +
> +	if (genl_register_family(&fs_genl_family) != 0) {
> +		printk(KERN_ERR
> +		       "VFS: Failed to create fs netlink interface.\n");
> +		return;
> +	}
> +	registered = 1;
> +}
> +EXPORT_SYMBOL(init_fs_nl_family);


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] fs: add netlink notification interface
  2011-08-18 15:00   ` Randy Dunlap
@ 2011-08-18 15:42     ` Lukas Czerner
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 15:42 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: Lukas Czerner, linux-kernel, linux-fsdevel, Alexander Viro

On Thu, 18 Aug 2011, Randy Dunlap wrote:

> On Thu, 18 Aug 2011 14:18:22 +0200 Lukas Czerner wrote:
> 
> 
> >  fs/Makefile        |    2 +-
> >  fs/netlink.c       |  107 ++++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/fs.h |   11 +++++
> >  3 files changed, 119 insertions(+), 1 deletions(-)
> >  create mode 100644 fs/netlink.c
> 
> Hi,
> 
> Does this build OK when CONFIG_NET is not enabled?

Good point :), it does not. I'll fix that once I have more comments.

> 
> and also for kernels that do not have CONFIG_NET enabled:  are these
> new netlink messages basically duplicates of printk() calls that are
> already there and will remain there?

No they are not. There is not a single printk I am duplicating right
now.

Thanks!
-Lukas

> 
> Thanks.
> 
> > diff --git a/fs/Makefile b/fs/Makefile
> > index afc1096..7e9c61f 100644
> > --- a/fs/Makefile
> > +++ b/fs/Makefile
> > @@ -11,7 +11,7 @@ obj-y :=	open.o read_write.o file_table.o super.o \
> >  		attr.o bad_inode.o file.o filesystems.o namespace.o \
> >  		seq_file.o xattr.o libfs.o fs-writeback.o \
> >  		pnode.o drop_caches.o splice.o sync.o utimes.o \
> > -		stack.o fs_struct.o statfs.o
> > +		stack.o fs_struct.o statfs.o netlink.o
> >  
> >  ifeq ($(CONFIG_BLOCK),y)
> >  obj-y +=	buffer.o bio.o block_dev.o direct-io.o mpage.o ioprio.o
> > diff --git a/fs/netlink.c b/fs/netlink.c
> > new file mode 100644
> > index 0000000..15c44a1
> > --- /dev/null
> > +++ b/fs/netlink.c
> > @@ -0,0 +1,107 @@
> > +#include <linux/fs.h>
> > +#include <linux/cred.h>
> > +#include <linux/init.h>
> > +#include <linux/module.h>
> > +#include <linux/kernel.h>
> > +#include <linux/sched.h>
> > +#include <linux/slab.h>
> > +#include <net/netlink.h>
> > +#include <net/genetlink.h>
> > +
> > +enum {
> > +	FS_NL_A_UNSPEC,
> > +	FS_NL_A_WARNING,
> > +	FS_NL_A_DEV_MAJOR,
> > +	FS_NL_A_DEV_MINOR,
> > +	FS_NL_A_CAUSED_ID,
> > +	__FS_NL_A_MAX,
> > +};
> > +#define FS_NL_A_MAX (__FS_NL_A_MAX - 1)
> > +
> > +enum {
> > +	FS_NL_C_UNSPEC,
> > +	FS_NL_C_WARNING,
> > +	__FS_NL_C_MAX,
> > +};
> > +#define FS_NL_C_MAX (__FS_NL_C_MAX - 1)
> > +
> > +
> > +static struct genl_family fs_genl_family = {
> > +	.id = GENL_ID_GENERATE,
> > +	.hdrsize = 0,
> > +	.name = "FS_MSG",
> > +	.version = 1,
> > +	.maxattr = FS_NL_A_MAX,
> > +};
> > +static int registered;
> > +
> > +/**
> > + * fs_nl_send_warning - Send warning from file system to userspace
> > + * @dev: The device on which the fs is mounted
> > + * @warntype: The type of the warning
> > + *
> > + * This can be used by file systems to send a warning message to the
> > + * userspace.
> > + */
> > +
> > +void fs_nl_send_warning(dev_t dev, unsigned int warntype)
> > +{
> > +	static atomic_t seq;
> > +	struct sk_buff *skb;
> > +	void *msg_head;
> > +	int ret;
> > +	int msg_size = nla_total_size(sizeof(u64)) +
> > +			3 * nla_total_size(sizeof(u32));
> > +
> > +	/* We have to allocate using GFP_NOFS as we are called from a
> > +	 * filesystem performing write and thus further recursion into
> > +	 * the fs to free some data could cause deadlocks. */
> > +	skb = genlmsg_new(msg_size, GFP_NOFS);
> > +	if (!skb) {
> > +		printk(KERN_ERR
> > +		  "VFS: Not enough memory to send fs warning.\n");
> > +		return;
> > +	}
> > +	msg_head = genlmsg_put(skb, 0, atomic_add_return(1, &seq),
> > +			&fs_genl_family, 0, FS_NL_C_WARNING);
> > +	if (!msg_head) {
> > +		printk(KERN_ERR
> > +		  "VFS: Cannot store netlink header in fs warning.\n");
> > +		goto err_out;
> > +	}
> > +	ret = nla_put_u32(skb, FS_NL_A_WARNING, warntype);
> > +	if (ret)
> > +		goto attr_err_out;
> > +	ret = nla_put_u32(skb, FS_NL_A_DEV_MAJOR, MAJOR(dev));
> > +	if (ret)
> > +		goto attr_err_out;
> > +	ret = nla_put_u32(skb, FS_NL_A_DEV_MINOR, MINOR(dev));
> > +	if (ret)
> > +		goto attr_err_out;
> > +	ret = nla_put_u64(skb, FS_NL_A_CAUSED_ID, current_uid());
> > +	if (ret)
> > +		goto attr_err_out;
> > +	genlmsg_end(skb, msg_head);
> > +	genlmsg_multicast(skb, 0, fs_genl_family.id, GFP_NOFS);
> > +	return;
> > +attr_err_out:
> > +	printk(KERN_ERR "VFS: Not enough space to compose "
> > +			"fs netlink message!\n");
> > +err_out:
> > +	kfree_skb(skb);
> > +}
> > +EXPORT_SYMBOL(fs_nl_send_warning);
> > +
> > +void init_fs_nl_family(void)
> > +{
> > +	if (registered)
> > +		return;
> > +
> > +	if (genl_register_family(&fs_genl_family) != 0) {
> > +		printk(KERN_ERR
> > +		       "VFS: Failed to create fs netlink interface.\n");
> > +		return;
> > +	}
> > +	registered = 1;
> > +}
> > +EXPORT_SYMBOL(init_fs_nl_family);
> 
> 
> ---
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 

-- 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18 ` [PATCH 5/5] btrfs: " Lukas Czerner
@ 2011-08-18 18:22   ` David Sterba
  2011-08-18 19:03     ` Lukas Czerner
  0 siblings, 1 reply; 17+ messages in thread
From: David Sterba @ 2011-08-18 18:22 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-kernel, linux-fsdevel, Chris Mason, linux-btrfs

Hi,

I see you are mixing a cleanup while adding a new feature.

On Thu, Aug 18, 2011 at 02:18:26PM +0200, Lukas Czerner wrote:
> Register fs netlink interface and send proper warning if ENOSPC is
> encountered. Note that we differentiate between enospc for metadata and
> enospc for data.
> 
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> Cc: Chris Mason <chris.mason@oracle.com>
> Cc: linux-btrfs@vger.kernel.org
> ---
>  fs/btrfs/extent-tree.c |   13 ++++++++++---
>  fs/btrfs/super.c       |    1 +
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 66bac22..a47d9b9 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)

hm, if diff is correct with the context, you are changing function
btrfs_check_data_free_space and reporting it as metadata ENOSPC later.
Although it's called for metadata (or I should say non-user file blocks,
like space cache, ino cache) block allocation, it's called from
btrfs_fallocate too. Seems that there has to be additional flag to say
if it's really metadata or data.

>  {
>  	struct btrfs_space_info *data_sinfo;
>  	struct btrfs_root *root = BTRFS_I(inode)->root;
> +	struct btrfs_fs_info *fs_info = root->fs_info;

change unrealated to ENOSPC-netlink

>  	u64 used;
>  	int ret = 0, committed = 0, alloc_chunk = 1;
>  
>  	/* make sure bytes are sectorsize aligned */
>  	bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
>  
> -	if (root == root->fs_info->tree_root ||
> +	if (root == fs_info->tree_root ||

here

>  	    BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
>  		alloc_chunk = 0;
>  		committed = 1;
> @@ -3070,7 +3071,7 @@ alloc:
>  			if (IS_ERR(trans))
>  				return PTR_ERR(trans);
>  
> -			ret = do_chunk_alloc(trans, root->fs_info->extent_root,
> +			ret = do_chunk_alloc(trans, fs_info->extent_root,

here

>  					     bytes + 2 * 1024 * 1024,
>  					     alloc_target,
>  					     CHUNK_ALLOC_NO_FORCE);
> @@ -3100,7 +3101,7 @@ alloc:
>  		/* commit the current transaction and try again */
>  commit_trans:
>  		if (!committed &&
> -		    !atomic_read(&root->fs_info->open_ioctl_trans)) {
> +		    !atomic_read(&fs_info->open_ioctl_trans)) {

here

>  			committed = 1;
>  			trans = btrfs_join_transaction(root);
>  			if (IS_ERR(trans))
> @@ -3111,6 +3112,8 @@ commit_trans:
>  			goto again;
>  		}
>  
> +		fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
> +				   FS_NL_ENOSPC_WARN);

or is it due to this line being too long with root-> ? :)

>  		return -ENOSPC;
>  	}
>  	data_sinfo->bytes_may_use += bytes;
> @@ -3522,6 +3525,10 @@ again:
>  	}
>  
>  out:
> +	if (unlikely(-ENOSPC == ret)) {

'unlikely' is not needed here, it does not bring anything compiler
wouldn't know, static branch prediction will give low probabiliy to this
check anyway

> +		dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
> +		fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
> +	}
>  	if (flushing) {
>  		spin_lock(&space_info->lock);
>  		space_info->flush = 0;
> diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> index 15634d4..8ac9e01 100644
> --- a/fs/btrfs/super.c
> +++ b/fs/btrfs/super.c
> @@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
>  	if (err)
>  		goto unregister_ioctl;
>  
> +	init_fs_nl_family();
>  	printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
>  	return 0;


david

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 5/5] btrfs: use fs netlink interface for ENOSPC conditions
  2011-08-18 18:22   ` David Sterba
@ 2011-08-18 19:03     ` Lukas Czerner
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-18 19:03 UTC (permalink / raw)
  To: David Sterba
  Cc: Lukas Czerner, linux-kernel, linux-fsdevel, Chris Mason, linux-btrfs

On Thu, 18 Aug 2011, David Sterba wrote:

> Hi,
> 
> I see you are mixing a cleanup while adding a new feature.
> 
> On Thu, Aug 18, 2011 at 02:18:26PM +0200, Lukas Czerner wrote:
> > Register fs netlink interface and send proper warning if ENOSPC is
> > encountered. Note that we differentiate between enospc for metadata and
> > enospc for data.
> > 
> > Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> > Cc: Chris Mason <chris.mason@oracle.com>
> > Cc: linux-btrfs@vger.kernel.org
> > ---
> >  fs/btrfs/extent-tree.c |   13 ++++++++++---
> >  fs/btrfs/super.c       |    1 +
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> > 
> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> > index 66bac22..a47d9b9 100644
> > --- a/fs/btrfs/extent-tree.c
> > +++ b/fs/btrfs/extent-tree.c
> > @@ -3029,13 +3029,14 @@ int btrfs_check_data_free_space(struct inode *inode, u64 bytes)
> 
> hm, if diff is correct with the context, you are changing function
> btrfs_check_data_free_space and reporting it as metadata ENOSPC later.
> Although it's called for metadata (or I should say non-user file blocks,
> like space cache, ino cache) block allocation, it's called from
> btrfs_fallocate too. Seems that there has to be additional flag to say
> if it's really metadata or data.

Actually diff got the context wrong. Sometimes it has some difficulties
if there are labels involved.

> 
> >  {
> >  	struct btrfs_space_info *data_sinfo;
> >  	struct btrfs_root *root = BTRFS_I(inode)->root;
> > +	struct btrfs_fs_info *fs_info = root->fs_info;
> 
> change unrealated to ENOSPC-netlink

Not really. I am using fs_info to get reference to the last_bdev and
it is useful for other places as well.

> 
> >  	u64 used;
> >  	int ret = 0, committed = 0, alloc_chunk = 1;
> >  
> >  	/* make sure bytes are sectorsize aligned */
> >  	bytes = (bytes + root->sectorsize - 1) & ~((u64)root->sectorsize - 1);
> >  
> > -	if (root == root->fs_info->tree_root ||
> > +	if (root == fs_info->tree_root ||
> 
> here
> 
> >  	    BTRFS_I(inode)->location.objectid == BTRFS_FREE_INO_OBJECTID) {
> >  		alloc_chunk = 0;
> >  		committed = 1;
> > @@ -3070,7 +3071,7 @@ alloc:
> >  			if (IS_ERR(trans))
> >  				return PTR_ERR(trans);
> >  
> > -			ret = do_chunk_alloc(trans, root->fs_info->extent_root,
> > +			ret = do_chunk_alloc(trans, fs_info->extent_root,
> 
> here
> 
> >  					     bytes + 2 * 1024 * 1024,
> >  					     alloc_target,
> >  					     CHUNK_ALLOC_NO_FORCE);
> > @@ -3100,7 +3101,7 @@ alloc:
> >  		/* commit the current transaction and try again */
> >  commit_trans:
> >  		if (!committed &&
> > -		    !atomic_read(&root->fs_info->open_ioctl_trans)) {
> > +		    !atomic_read(&fs_info->open_ioctl_trans)) {
> 
> here
> 
> >  			committed = 1;
> >  			trans = btrfs_join_transaction(root);
> >  			if (IS_ERR(trans))
> > @@ -3111,6 +3112,8 @@ commit_trans:
> >  			goto again;
> >  		}
> >  
> > +		fs_nl_send_warning(fs_info->fs_devices->latest_bdev->bd_dev,
> > +				   FS_NL_ENOSPC_WARN);
> 
> or is it due to this line being too long with root-> ? :)

exactly :) But as I said fs_info is referenced on other paces as well so
it is useful anyway.

> 
> >  		return -ENOSPC;
> >  	}
> >  	data_sinfo->bytes_may_use += bytes;
> > @@ -3522,6 +3525,10 @@ again:
> >  	}
> >  
> >  out:
> > +	if (unlikely(-ENOSPC == ret)) {
> 
> 'unlikely' is not needed here, it does not bring anything compiler
> wouldn't know, static branch prediction will give low probabiliy to this
> check anyway

Ok, but I would like to know why do you think that. It is not only in
the error path and there are actually several paths to this condition so
maybe I am being dense, could you explain it a bit for me ?

Thanks!
-Lukas

> 
> > +		dev_t bdev = root->fs_info->fs_devices->latest_bdev->bd_dev;
> > +		fs_nl_send_warning(bdev, FS_NL_META_ENOSPC_WARN);
> > +	}
> >  	if (flushing) {
> >  		spin_lock(&space_info->lock);
> >  		space_info->flush = 0;
> > diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
> > index 15634d4..8ac9e01 100644
> > --- a/fs/btrfs/super.c
> > +++ b/fs/btrfs/super.c
> > @@ -1266,6 +1266,7 @@ static int __init init_btrfs_fs(void)
> >  	if (err)
> >  		goto unregister_ioctl;
> >  
> > +	init_fs_nl_family();
> >  	printk(KERN_INFO "%s loaded\n", BTRFS_BUILD_VERSION);
> >  	return 0;
> 
> 
> david
> 

-- 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] fs: add netlink notification interface
  2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
  2011-08-18 15:00   ` Randy Dunlap
@ 2011-08-19 15:15   ` Ted Ts'o
  2011-08-23  9:03     ` Lukas Czerner
  1 sibling, 1 reply; 17+ messages in thread
From: Ted Ts'o @ 2011-08-19 15:15 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-kernel, linux-fsdevel, Alexander Viro

On Thu, Aug 18, 2011 at 02:18:22PM +0200, Lukas Czerner wrote:
> +void fs_nl_send_warning(dev_t dev, unsigned int warntype)
> +{

I'm pretty sure this interface isn't going to be sufficient in the
long run.  I can imagine that people might want to pass in a struct
inode *, or a struct file * if available, so we could pass back the
inode number involved, or even the pathname.

Also, I'd suggesting changing "warning" to "error", since that's what
most of the things we are sending really are...

> +	ret = nla_put_u32(skb, FS_NL_A_WARNING, warntype);
> +	if (ret)
> +		goto attr_err_out;

Why not make the warning type to be a string instead of an integer?
This would make it easier to extend the protocol in the future.

> +	ret = nla_put_u64(skb, FS_NL_A_CAUSED_ID, current_uid());
> +	if (ret)
> +		goto attr_err_out;

I'd also suggest passing back the process id that requested the I/O if
that's known...

						- Ted

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 1/5] fs: add netlink notification interface
  2011-08-19 15:15   ` Ted Ts'o
@ 2011-08-23  9:03     ` Lukas Czerner
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-23  9:03 UTC (permalink / raw)
  To: Ted Ts'o; +Cc: Lukas Czerner, linux-kernel, linux-fsdevel, Alexander Viro

On Fri, 19 Aug 2011, Ted Ts'o wrote:

> On Thu, Aug 18, 2011 at 02:18:22PM +0200, Lukas Czerner wrote:
> > +void fs_nl_send_warning(dev_t dev, unsigned int warntype)
> > +{
> 
> I'm pretty sure this interface isn't going to be sufficient in the
> long run.  I can imagine that people might want to pass in a struct
> inode *, or a struct file * if available, so we could pass back the
> inode number involved, or even the pathname.

Ok, it makes sense to extend the interface a bit. So what about:

voif fs_nl_send_warning(dev_t dev, unsigned int warntype,
			unsigned int flags, void *data);

So we can get the type of the warning, flags which will provide more
information about the waring as for example what type of data are we
attaching, or what type of quota is that etc... And with data we can
easily pass struct inode *, or struct file * or whatever the type of
warning needs for better description of the problem.

> 
> Also, I'd suggesting changing "warning" to "error", since that's what
> most of the things we are sending really are...

I am ok with calling it error, but I do not consider ENOSPC, or
exceeding quota to be an error from the kernel point of view. In
application it is of course an error, but in file system it is just the
state of the things. But anyway it is just my viewpoint and if people
think that it should be called *error* i am ok with that.

> 
> > +	ret = nla_put_u32(skb, FS_NL_A_WARNING, warntype);
> > +	if (ret)
> > +		goto attr_err_out;
> 
> Why not make the warning type to be a string instead of an integer?
> This would make it easier to extend the protocol in the future.

I think that having an ID of the type of the message is important so we
can clearly say what happened without the need of parsing the string.
Otherwise it would be no different than printk's.

I am not sure why would we want to pass string instead. If we want to
pass random errors down this interface I do not think that is going to
be very useful.

> 
> > +	ret = nla_put_u64(skb, FS_NL_A_CAUSED_ID, current_uid());
> > +	if (ret)
> > +		goto attr_err_out;
> 
> I'd also suggest passing back the process id that requested the I/O if
> that's known...

Good point.

> 
> 						- Ted
> 

Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
  2011-08-18 12:18   ` Lukas Czerner
@ 2011-08-25  5:16     ` Christoph Hellwig
  -1 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2011-08-25  5:16 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-kernel, linux-fsdevel, Christoph Hellwig, xfs

>  	xfs_rw_iunlock(ip, iolock);
> +	if (unlikely(-ENOSPC == ret))
> +		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);

I'd remove the nl from both the name and the constants.  In the end what
matters is the warning, and netlink just is an implementation detail.

> index 9a72dda..dd167a0 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1731,6 +1731,7 @@ init_xfs_fs(void)
>  	error = register_filesystem(&xfs_fs_type);
>  	if (error)
>  		goto out_sysctl_unregister;
> +	init_fs_nl_family();

Why do we have to call this from the filesystem?  Shouldn't we
initialize it once from the VFS?



Also any chance you could include the quota netlink warnings into the
framework?  Any callers is also going to be interestested in quota
warnings, not just enospc.  Also the xfs project quota code returns
ENOSPC if over the project quota and needs to be handled either way.
Adding another new category inbetween user/group quotas and plain ENOSPC
for it would be nice.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
@ 2011-08-25  5:16     ` Christoph Hellwig
  0 siblings, 0 replies; 17+ messages in thread
From: Christoph Hellwig @ 2011-08-25  5:16 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-fsdevel, xfs, linux-kernel, Christoph Hellwig

>  	xfs_rw_iunlock(ip, iolock);
> +	if (unlikely(-ENOSPC == ret))
> +		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);

I'd remove the nl from both the name and the constants.  In the end what
matters is the warning, and netlink just is an implementation detail.

> index 9a72dda..dd167a0 100644
> --- a/fs/xfs/linux-2.6/xfs_super.c
> +++ b/fs/xfs/linux-2.6/xfs_super.c
> @@ -1731,6 +1731,7 @@ init_xfs_fs(void)
>  	error = register_filesystem(&xfs_fs_type);
>  	if (error)
>  		goto out_sysctl_unregister;
> +	init_fs_nl_family();

Why do we have to call this from the filesystem?  Shouldn't we
initialize it once from the VFS?



Also any chance you could include the quota netlink warnings into the
framework?  Any callers is also going to be interestested in quota
warnings, not just enospc.  Also the xfs project quota code returns
ENOSPC if over the project quota and needs to be handled either way.
Adding another new category inbetween user/group quotas and plain ENOSPC
for it would be nice.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
  2011-08-25  5:16     ` Christoph Hellwig
@ 2011-08-25  8:58       ` Lukas Czerner
  -1 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-25  8:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Lukas Czerner, linux-kernel, linux-fsdevel, Christoph Hellwig, xfs

On Thu, 25 Aug 2011, Christoph Hellwig wrote:

> >  	xfs_rw_iunlock(ip, iolock);
> > +	if (unlikely(-ENOSPC == ret))
> > +		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);
> 
> I'd remove the nl from both the name and the constants.  In the end what
> matters is the warning, and netlink just is an implementation detail.

That makes sense, I'll change that.

> 
> > index 9a72dda..dd167a0 100644
> > --- a/fs/xfs/linux-2.6/xfs_super.c
> > +++ b/fs/xfs/linux-2.6/xfs_super.c
> > @@ -1731,6 +1731,7 @@ init_xfs_fs(void)
> >  	error = register_filesystem(&xfs_fs_type);
> >  	if (error)
> >  		goto out_sysctl_unregister;
> > +	init_fs_nl_family();
> 
> Why do we have to call this from the filesystem?  Shouldn't we
> initialize it once from the VFS?

That's probably better than initialization in every file system. It
would also mean that the interface will be initializes even if there is
no user of it on the kernel side, but that should not be a problem I
guess.

> 
> 
> 
> Also any chance you could include the quota netlink warnings into the
> framework?  Any callers is also going to be interestested in quota
> warnings, not just enospc.  Also the xfs project quota code returns
> ENOSPC if over the project quota and needs to be handled either way.
> Adding another new category inbetween user/group quotas and plain ENOSPC
> for it would be nice.

That's the plan. I want to merge quota notification into this interface
as well (rather than having separate netlink interface
fs/quota/netlink.c). So, I'll try to do that in the next version of the
patches.

Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH 4/5] xfs: use fs netlink interface for ENOSPC conditions
@ 2011-08-25  8:58       ` Lukas Czerner
  0 siblings, 0 replies; 17+ messages in thread
From: Lukas Czerner @ 2011-08-25  8:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: linux-fsdevel, Lukas Czerner, xfs, linux-kernel, Christoph Hellwig

On Thu, 25 Aug 2011, Christoph Hellwig wrote:

> >  	xfs_rw_iunlock(ip, iolock);
> > +	if (unlikely(-ENOSPC == ret))
> > +		fs_nl_send_warning(inode->i_sb->s_dev, FS_NL_ENOSPC_WARN);
> 
> I'd remove the nl from both the name and the constants.  In the end what
> matters is the warning, and netlink just is an implementation detail.

That makes sense, I'll change that.

> 
> > index 9a72dda..dd167a0 100644
> > --- a/fs/xfs/linux-2.6/xfs_super.c
> > +++ b/fs/xfs/linux-2.6/xfs_super.c
> > @@ -1731,6 +1731,7 @@ init_xfs_fs(void)
> >  	error = register_filesystem(&xfs_fs_type);
> >  	if (error)
> >  		goto out_sysctl_unregister;
> > +	init_fs_nl_family();
> 
> Why do we have to call this from the filesystem?  Shouldn't we
> initialize it once from the VFS?

That's probably better than initialization in every file system. It
would also mean that the interface will be initializes even if there is
no user of it on the kernel side, but that should not be a problem I
guess.

> 
> 
> 
> Also any chance you could include the quota netlink warnings into the
> framework?  Any callers is also going to be interestested in quota
> warnings, not just enospc.  Also the xfs project quota code returns
> ENOSPC if over the project quota and needs to be handled either way.
> Adding another new category inbetween user/group quotas and plain ENOSPC
> for it would be nice.

That's the plan. I want to merge quota notification into this interface
as well (rather than having separate netlink interface
fs/quota/netlink.c). So, I'll try to do that in the next version of the
patches.

Thanks!
-Lukas

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2011-08-25  8:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-08-18 12:18 [RFC][PATCH 0/5] Add netlink file system notification interface Lukas Czerner
2011-08-18 12:18 ` [PATCH 1/5] fs: add netlink " Lukas Czerner
2011-08-18 15:00   ` Randy Dunlap
2011-08-18 15:42     ` Lukas Czerner
2011-08-19 15:15   ` Ted Ts'o
2011-08-23  9:03     ` Lukas Czerner
2011-08-18 12:18 ` [PATCH 2/5] ext3: use fs netlink interface for ENOSPC conditions Lukas Czerner
2011-08-18 12:18 ` [PATCH 3/5] ext4: " Lukas Czerner
2011-08-18 12:18 ` [PATCH 4/5] xfs: " Lukas Czerner
2011-08-18 12:18   ` Lukas Czerner
2011-08-25  5:16   ` Christoph Hellwig
2011-08-25  5:16     ` Christoph Hellwig
2011-08-25  8:58     ` Lukas Czerner
2011-08-25  8:58       ` Lukas Czerner
2011-08-18 12:18 ` [PATCH 5/5] btrfs: " Lukas Czerner
2011-08-18 18:22   ` David Sterba
2011-08-18 19:03     ` Lukas Czerner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.