* [PATCH v2 0/4] Add online file check feature
@ 2015-10-28  6:25 ` Gang He
  0 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:25 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Errors reported by the ocfs2 filesystem are usually accompanied by the
number of the inode which caused the error.
That inode number is the input needed to check or fix the file.
This series adds a file in the sysfs filesystem which accepts inode numbers.
Reading the same file reports what has to be fixed or has been fixed.
You could write:
$# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
or
$# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck

Changes since the first version: use a single strncasecmp() call instead
of two strncmp() calls, and update the vendor named in the new source files.
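
For readers who want to see the request format in isolation, here is a
minimal userspace sketch of parsing the "CHECK <inode>", "FIX <inode>" and
"SET <n>" strings with one case-insensitive compare per keyword. It is not
the kernel code from this series: the names fc_type, fc_request,
fc_parse_number() and fc_parse() are hypothetical, and it uses the libc
strtoul() where the patch uses kstrtoul().

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() */

/* Hypothetical names, for illustration only. */
enum fc_type { FC_CHECK, FC_FIX, FC_SET };

struct fc_request {
	enum fc_type type;
	unsigned long value;	/* inode number, or queue length for SET */
};

/* Parse the number that follows the keyword; returns 0 on success. */
static int fc_parse_number(const char *s, unsigned long *val)
{
	char *end;

	/* Reject empty input, signs and leading blanks up front. */
	if (!isdigit((unsigned char)*s))
		return -1;

	errno = 0;
	*val = strtoul(s, &end, 0);	/* base 0: decimal, 0x..., 0... */
	if (errno || end == s)
		return -1;

	/* Accept one trailing newline, since "echo" appends one. */
	if (*end == '\n')
		end++;
	return *end == '\0' ? 0 : -1;
}

/* Parse one request line; returns 0 on success, -1 on invalid input. */
static int fc_parse(const char *line, struct fc_request *req)
{
	static const struct {
		const char *kw;
		enum fc_type type;
	} cmds[] = {
		{ "CHECK ", FC_CHECK },
		{ "FIX ", FC_FIX },
		{ "SET ", FC_SET },
	};
	size_t i;

	assert(line && req);
	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
		size_t n = strlen(cmds[i].kw);

		/* One compare covers "FIX ", "fix " and mixed case. */
		if (strncasecmp(line, cmds[i].kw, n) == 0) {
			req->type = cmds[i].type;
			return fc_parse_number(line + n, &req->value);
		}
	}
	return -1;
}
```

The keyword table keeps the three branches symmetric, so adding a command
means adding one row rather than another if/else arm.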

Gang He (4):
  ocfs2: export ocfs2_kset for online file check
  ocfs2: sysfile interfaces for online file check
  ocfs2: create/remove sysfile for online file check
  ocfs2: check/fix inode block for online file check

 fs/ocfs2/Makefile      |   3 +-
 fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/filecheck.h   |  48 +++++
 fs/ocfs2/inode.c       | 196 ++++++++++++++++-
 fs/ocfs2/inode.h       |   3 +
 fs/ocfs2/ocfs2_trace.h |   2 +
 fs/ocfs2/stackglue.c   |   3 +-
 fs/ocfs2/stackglue.h   |   2 +
 fs/ocfs2/super.c       |   5 +
 9 files changed, 820 insertions(+), 8 deletions(-)
 create mode 100644 fs/ocfs2/filecheck.c
 create mode 100644 fs/ocfs2/filecheck.h

-- 
2.1.2



* [PATCH v2 1/4] ocfs2: export ocfs2_kset for online file check
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-10-28  6:25   ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:25 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Export the ocfs2_kset object from the ocfs2_stackglue kernel module,
so that the online file check code can create its sysfs files
under the ocfs2_kset object.

Signed-off-by: Gang He <ghe@suse.com>
---
 fs/ocfs2/stackglue.c | 3 ++-
 fs/ocfs2/stackglue.h | 2 ++
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/stackglue.c b/fs/ocfs2/stackglue.c
index 5d965e8..13219ed 100644
--- a/fs/ocfs2/stackglue.c
+++ b/fs/ocfs2/stackglue.c
@@ -629,7 +629,8 @@ static struct attribute_group ocfs2_attr_group = {
 	.attrs = ocfs2_attrs,
 };
 
-static struct kset *ocfs2_kset;
+struct kset *ocfs2_kset;
+EXPORT_SYMBOL_GPL(ocfs2_kset);
 
 static void ocfs2_sysfs_exit(void)
 {
diff --git a/fs/ocfs2/stackglue.h b/fs/ocfs2/stackglue.h
index 66334a3..f2dce10 100644
--- a/fs/ocfs2/stackglue.h
+++ b/fs/ocfs2/stackglue.h
@@ -298,4 +298,6 @@ void ocfs2_stack_glue_set_max_proto_version(struct ocfs2_protocol_version *max_p
 int ocfs2_stack_glue_register(struct ocfs2_stack_plugin *plugin);
 void ocfs2_stack_glue_unregister(struct ocfs2_stack_plugin *plugin);
 
+extern struct kset *ocfs2_kset;
+
 #endif  /* STACKGLUE_H */
-- 
2.1.2



* [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-10-28  6:25   ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:25 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Implement the sysfs interface for online file check:
create the sysfs file for each device, named after the device,
and show and handle file check requests through that file.

Signed-off-by: Gang He <ghe@suse.com>
---
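Not part of the patch: a minimal userspace sketch of how a status code is
mapped to the string shown in the ERROR column. It assumes the layout used
in filecheck.h, where 0 means success and failure codes start at 1000, but
with an abbreviated code list. The FC_* names and fc_error() are
hypothetical, and assert() stands in for the kernel's BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Abbreviated, hypothetical code list: 0 is success, failures start at 1000. */
enum {
	FC_ERR_SUCCESS = 0,
	FC_ERR_FAILED = 1000,
	FC_ERR_INPROGRESS,
	FC_ERR_READONLY,
	FC_ERR_UNSUPPORTED
};

#define FC_ERR_START	FC_ERR_FAILED
#define FC_ERR_END	FC_ERR_UNSUPPORTED

/* Index 0 is success; index i > 0 is failure code FC_ERR_START + i - 1. */
static const char *const fc_errs[] = {
	"SUCCESS",
	"FAILED",
	"INPROGRESS",
	"READONLY",
	"UNSUPPORTED",
};

/* Fail the build if the table and the enum drift apart. */
_Static_assert(sizeof(fc_errs) / sizeof(fc_errs[0]) ==
	       FC_ERR_END - FC_ERR_START + 2,
	       "fc_errs[] must have one string per code");

static const char *fc_error(int err)
{
	if (err == FC_ERR_SUCCESS)
		return fc_errs[0];

	/* Both bounds are inclusive: FC_ERR_END is itself a valid code. */
	assert(err >= FC_ERR_START && err <= FC_ERR_END);
	return fc_errs[err - FC_ERR_START + 1];
}
```

The "+ 1" skips the success slot at index 0, which is why the string table
has to stay in the same order as the enum.
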
 fs/ocfs2/Makefile    |   3 +-
 fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/filecheck.h |  48 +++++
 fs/ocfs2/inode.h     |   3 +
 4 files changed, 619 insertions(+), 1 deletion(-)
 create mode 100644 fs/ocfs2/filecheck.c
 create mode 100644 fs/ocfs2/filecheck.h

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index ce210d4..e27e652 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -41,7 +41,8 @@ ocfs2-objs := \
 	quota_local.o		\
 	quota_global.o		\
 	xattr.o			\
-	acl.o
+	acl.o	\
+	filecheck.o
 
 ocfs2_stackglue-objs := stackglue.o
 ocfs2_stack_o2cb-objs := stack_o2cb.o
diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
new file mode 100644
index 0000000..f12ed1f
--- /dev/null
+++ b/fs/ocfs2/filecheck.c
@@ -0,0 +1,566 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.c
+ *
+ * Code which implements online file check.
+ *
+ * Copyright (C) 2015 Novell.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/kmod.h>
+#include <linux/fs.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/sysctl.h>
+#include <cluster/masklog.h>
+
+#include "ocfs2.h"
+#include "ocfs2_fs.h"
+#include "stackglue.h"
+#include "inode.h"
+
+#include "filecheck.h"
+
+
+/* File check error strings,
+ * must correspond with error number in header file.
+ */
+static const char * const ocfs2_filecheck_errs[] = {
+	"SUCCESS",
+	"FAILED",
+	"INPROGRESS",
+	"READONLY",
+	"INVALIDINO",
+	"BLOCKECC",
+	"BLOCKNO",
+	"VALIDFLAG",
+	"GENERATION",
+	"UNSUPPORTED"
+};
+
+static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
+static LIST_HEAD(ocfs2_filecheck_sysfs_list);
+
+struct ocfs2_filecheck {
+	struct list_head fc_head;	/* File check entry list head */
+	spinlock_t fc_lock;
+	unsigned int fc_max;	/* Maximum number of entries in list */
+	unsigned int fc_size;	/* Current entry count in list */
+	unsigned int fc_done;	/* Number of completed entries in list */
+};
+
+struct ocfs2_filecheck_sysfs_entry {
+	struct list_head fs_list;
+	atomic_t fs_count;
+	struct super_block *fs_sb;
+	struct kset *fs_kset;
+	struct ocfs2_filecheck *fs_fcheck;
+};
+
+#define OCFS2_FILECHECK_MAXSIZE		100
+#define OCFS2_FILECHECK_MINSIZE		10
+
+/* File check operation type */
+enum {
+	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
+	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
+	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
+};
+
+struct ocfs2_filecheck_entry {
+	struct list_head fe_list;
+	unsigned long fe_ino;
+	unsigned int fe_type;
+	unsigned short fe_done:1;
+	unsigned short fe_status:15;
+};
+
+struct ocfs2_filecheck_args {
+	unsigned int fa_type;
+	union {
+		unsigned long fa_ino;
+		unsigned int fa_len;
+	};
+};
+
+static const char *
+ocfs2_filecheck_error(int errno)
+{
+	if (!errno)
+		return ocfs2_filecheck_errs[errno];
+
+	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
+			errno > OCFS2_FILECHECK_ERR_END);
+	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					char *buf);
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					const char *buf, size_t count);
+static struct kobj_attribute ocfs2_attr_filecheck =
+					__ATTR(filecheck, S_IRUSR | S_IWUSR,
+					ocfs2_filecheck_show,
+					ocfs2_filecheck_store);
+
+static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
+{
+	schedule();
+	return 0;
+}
+
+static void
+ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	struct ocfs2_filecheck_entry *p;
+
+	if (!atomic_dec_and_test(&entry->fs_count))
+		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
+						TASK_UNINTERRUPTIBLE);
+
+	spin_lock(&entry->fs_fcheck->fc_lock);
+	while (!list_empty(&entry->fs_fcheck->fc_head)) {
+		p = list_first_entry(&entry->fs_fcheck->fc_head,
+				struct ocfs2_filecheck_entry, fe_list);
+		list_del(&p->fe_list);
+		BUG_ON(!p->fe_done); /* Must not free an unfinished entry */
+		kfree(p);
+	}
+	spin_unlock(&entry->fs_fcheck->fc_lock);
+
+	kset_unregister(entry->fs_kset);
+	kfree(entry->fs_fcheck);
+	kfree(entry);
+}
+
+static void
+ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+}
+
+static int ocfs2_filecheck_sysfs_del(const char *devname)
+{
+	struct ocfs2_filecheck_sysfs_entry *p;
+
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
+		if (!strcmp(p->fs_sb->s_id, devname)) {
+			list_del(&p->fs_list);
+			spin_unlock(&ocfs2_filecheck_sysfs_lock);
+			ocfs2_filecheck_sysfs_free(p);
+			return 0;
+		}
+	}
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+	return 1;
+}
+
+static void
+ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	if (atomic_dec_and_test(&entry->fs_count))
+		wake_up_atomic_t(&entry->fs_count);
+}
+
+static struct ocfs2_filecheck_sysfs_entry *
+ocfs2_filecheck_sysfs_get(const char *devname)
+{
+	struct ocfs2_filecheck_sysfs_entry *p = NULL;
+
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
+		if (!strcmp(p->fs_sb->s_id, devname)) {
+			atomic_inc(&p->fs_count);
+			spin_unlock(&ocfs2_filecheck_sysfs_lock);
+			return p;
+		}
+	}
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+	return NULL;
+}
+
+int ocfs2_filecheck_create_sysfs(struct super_block *sb)
+{
+	int ret = 0;
+	struct kset *ocfs2_filecheck_kset = NULL;
+	struct ocfs2_filecheck *fcheck = NULL;
+	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
+	struct attribute **attrs = NULL;
+	struct attribute_group attrgp;
+
+	if (!ocfs2_kset)
+		return -ENOMEM;
+
+	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
+	if (!attrs) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		attrs[0] = &ocfs2_attr_filecheck.attr;
+		attrs[1] = NULL;
+		memset(&attrgp, 0, sizeof(attrgp));
+		attrgp.attrs = attrs;
+	}
+
+	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
+	if (!fcheck) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		INIT_LIST_HEAD(&fcheck->fc_head);
+		spin_lock_init(&fcheck->fc_lock);
+		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
+		fcheck->fc_size = 0;
+		fcheck->fc_done = 0;
+	}
+
+	if (strlen(sb->s_id) <= 0) {
+		mlog(ML_ERROR,
+		"Cannot get device basename when creating filecheck sysfs\n");
+		ret = -ENODEV;
+		goto error;
+	}
+
+	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
+						&ocfs2_kset->kobj);
+	if (!ocfs2_filecheck_kset) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
+	if (ret)
+		goto error;
+
+	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
+	if (!entry) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		atomic_set(&entry->fs_count, 1);
+		entry->fs_sb = sb;
+		entry->fs_kset = ocfs2_filecheck_kset;
+		entry->fs_fcheck = fcheck;
+		ocfs2_filecheck_sysfs_add(entry);
+	}
+
+	kfree(attrs);
+	return 0;
+
+error:
+	kfree(attrs);
+	kfree(entry);
+	kfree(fcheck);
+	kset_unregister(ocfs2_filecheck_kset);
+	return ret;
+}
+
+int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
+{
+	return ocfs2_filecheck_sysfs_del(sb->s_id);
+}
+
+static int
+ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int count);
+static int
+ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int len)
+{
+	int ret;
+
+	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
+		return -EINVAL;
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
+		mlog(ML_ERROR,
+		"Cannot set online file check maximum entry number "
+		"to %u due to too many pending entries (%u)\n",
+		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
+		ret = -EBUSY;
+	} else {
+		if (len < ent->fs_fcheck->fc_size)
+			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
+				ent->fs_fcheck->fc_size - len));
+
+		ent->fs_fcheck->fc_max = len;
+		ret = 0;
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	return ret;
+}
+
+#define OCFS2_FILECHECK_ARGS_LEN	32
+static int
+ocfs2_filecheck_args_get_long(const char *buf, size_t count,
+				unsigned long *val)
+{
+	char buffer[OCFS2_FILECHECK_ARGS_LEN];
+
+	if (count < 1)
+		return 1;
+
+	memcpy(buffer, buf, count);
+	buffer[count] = '\0';
+
+	if (kstrtoul(buffer, 0, val))
+		return 1;
+
+	return 0;
+}
+
+static int
+ocfs2_filecheck_args_parse(const char *buf, size_t count,
+				struct ocfs2_filecheck_args *args)
+{
+	unsigned long val = 0;
+
+	/* too short/long args length */
+	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
+		return 1;
+
+	if (!strncasecmp(buf, "FIX ", 4)) {
+		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
+		args->fa_ino = val;
+		return 0;
+	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
+		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
+		args->fa_ino = val;
+		return 0;
+	} else if (!strncasecmp(buf, "SET ", 4)) {
+		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
+		args->fa_len = (unsigned int)val;
+		return 0;
+	} else { /* invalid args */
+		return 1;
+	}
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					char *buf)
+{
+
+	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
+	struct ocfs2_filecheck_entry *p;
+	struct ocfs2_filecheck_sysfs_entry *ent;
+
+	ent = ocfs2_filecheck_sysfs_get(kobj->name);
+	if (!ent) {
+		mlog(ML_ERROR,
+		"Cannot get the corresponding entry via device basename %s\n",
+		kobj->name);
+		return -ENODEV;
+	}
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
+	total += ret;
+	remain -= ret;
+
+	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
+		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
+			p->fe_ino, p->fe_type, p->fe_done,
+			ocfs2_filecheck_error(p->fe_status));
+		if (ret < 0) {
+			total = ret;
+			break;
+		}
+		if (ret >= remain) {
+			/* snprintf() output was truncated */
+			total = -E2BIG;
+			break;
+		}
+		total += ret;
+		remain -= ret;
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	ocfs2_filecheck_sysfs_put(ent);
+	return total;
+}
+
+static int
+ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
+{
+	struct ocfs2_filecheck_entry *p;
+
+	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
+		if (p->fe_done) {
+			list_del(&p->fe_list);
+			kfree(p);
+			ent->fs_fcheck->fc_size--;
+			ent->fs_fcheck->fc_done--;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int count)
+{
+	unsigned int i = 0;
+	unsigned int ret = 0;
+
+	while (i++ < count) {
+		if (ocfs2_filecheck_erase_entry(ent))
+			ret++;
+		else
+			break;
+	}
+
+	return (ret == count ? 1 : 0);
+}
+
+static void
+ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
+				struct ocfs2_filecheck_entry *entry)
+{
+	entry->fe_done = 1;
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	ent->fs_fcheck->fc_done++;
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+}
+
+static unsigned short
+ocfs2_filecheck_handle(struct super_block *sb,
+				unsigned long ino, unsigned int flags)
+{
+	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
+	struct inode *inode = NULL;
+	int rc;
+
+	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
+	if (IS_ERR(inode)) {
+		rc = (int)(-(long)inode);
+		if (rc >= OCFS2_FILECHECK_ERR_START &&
+			rc < OCFS2_FILECHECK_ERR_END)
+			ret = rc;
+		else
+			ret = OCFS2_FILECHECK_ERR_FAILED;
+	} else
+		iput(inode);
+
+	return ret;
+}
+
+static void
+ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
+				struct ocfs2_filecheck_entry *entry)
+{
+	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
+		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
+				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
+	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
+		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
+				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
+	else
+		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
+
+	ocfs2_filecheck_done_entry(ent, entry);
+}
+
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct ocfs2_filecheck_args args;
+	struct ocfs2_filecheck_entry *entry = NULL;
+	struct ocfs2_filecheck_sysfs_entry *ent;
+	ssize_t ret = 0;
+
+	if (count == 0)
+		return count;
+
+	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
+		mlog(ML_ERROR, "Invalid arguments for online file check\n");
+		return -EINVAL;
+	}
+
+	ent = ocfs2_filecheck_sysfs_get(kobj->name);
+	if (!ent) {
+		mlog(ML_ERROR,
+		"Cannot get the corresponding entry via device basename %s\n",
+		kobj->name);
+		return -ENODEV;
+	}
+
+	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
+		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
+		ocfs2_filecheck_sysfs_put(ent);
+		return (!ret ? count : ret);
+	}
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
+		(ent->fs_fcheck->fc_done == 0)) {
+		mlog(ML_ERROR,
+		"Online file check queue(%u) is full\n",
+		ent->fs_fcheck->fc_max);
+		ret = -EBUSY;
+	} else {
+		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
+			(ent->fs_fcheck->fc_done > 0)) {
+			/* Delete the oldest entry which was done,
+			 * make sure the entry size in list does
+			 * not exceed maximum value
+			 */
+			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
+		}
+
+		entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
+		if (entry) {
+			entry->fe_ino = args.fa_ino;
+			entry->fe_type = args.fa_type;
+			entry->fe_done = 0;
+			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
+			list_add_tail(&entry->fe_list,
+					&ent->fs_fcheck->fc_head);
+
+			ent->fs_fcheck->fc_size++;
+			ret = count;
+		} else {
+			ret = -ENOMEM;
+		}
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	if (entry)
+		ocfs2_filecheck_handle_entry(ent, entry);
+
+	ocfs2_filecheck_sysfs_put(ent);
+	return ret;
+}
diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
new file mode 100644
index 0000000..5ec331b
--- /dev/null
+++ b/fs/ocfs2/filecheck.h
@@ -0,0 +1,48 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.h
+ *
+ * Online file check.
+ *
+ * Copyright (C) 2015 Novell.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+
+#ifndef FILECHECK_H
+#define FILECHECK_H
+
+#include <linux/types.h>
+#include <linux/list.h>
+
+
+/* File check errno */
+enum {
+	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
+	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
+	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
+	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
+	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
+	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
+	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
+	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
+	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
+	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
+};
+
+#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
+#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
+
+int ocfs2_filecheck_create_sysfs(struct super_block *sb);
+int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
+
+#endif  /* FILECHECK_H */
diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
index 5e86b24..abd1018 100644
--- a/fs/ocfs2/inode.h
+++ b/fs/ocfs2/inode.h
@@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
 /* Flags for ocfs2_iget() */
 #define OCFS2_FI_FLAG_SYSFILE		0x1
 #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
+#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
+#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
+
 struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
 struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
 			 int sysfile_type);
-- 
2.1.2



* [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
@ 2015-10-28  6:25   ` Gang He
  0 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:25 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Implement the sysfs interface for online file check:
create the sysfs file for each device, named after the device,
and show and handle file check requests through that file.

Signed-off-by: Gang He <ghe@suse.com>
---
 fs/ocfs2/Makefile    |   3 +-
 fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
 fs/ocfs2/filecheck.h |  48 +++++
 fs/ocfs2/inode.h     |   3 +
 4 files changed, 619 insertions(+), 1 deletion(-)
 create mode 100644 fs/ocfs2/filecheck.c
 create mode 100644 fs/ocfs2/filecheck.h

diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
index ce210d4..e27e652 100644
--- a/fs/ocfs2/Makefile
+++ b/fs/ocfs2/Makefile
@@ -41,7 +41,8 @@ ocfs2-objs := \
 	quota_local.o		\
 	quota_global.o		\
 	xattr.o			\
-	acl.o
+	acl.o	\
+	filecheck.o
 
 ocfs2_stackglue-objs := stackglue.o
 ocfs2_stack_o2cb-objs := stack_o2cb.o
diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
new file mode 100644
index 0000000..f12ed1f
--- /dev/null
+++ b/fs/ocfs2/filecheck.c
@@ -0,0 +1,566 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.c
+ *
+ * Code which implements online file check.
+ *
+ * Copyright (C) 2015 Novell.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#include <linux/list.h>
+#include <linux/spinlock.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/kmod.h>
+#include <linux/fs.h>
+#include <linux/kobject.h>
+#include <linux/sysfs.h>
+#include <linux/sysctl.h>
+#include <cluster/masklog.h>
+
+#include "ocfs2.h"
+#include "ocfs2_fs.h"
+#include "stackglue.h"
+#include "inode.h"
+
+#include "filecheck.h"
+
+
+/* File check error strings,
+ * must correspond with error number in header file.
+ */
+static const char * const ocfs2_filecheck_errs[] = {
+	"SUCCESS",
+	"FAILED",
+	"INPROGRESS",
+	"READONLY",
+	"INVALIDINO",
+	"BLOCKECC",
+	"BLOCKNO",
+	"VALIDFLAG",
+	"GENERATION",
+	"UNSUPPORTED"
+};
+
+static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
+static LIST_HEAD(ocfs2_filecheck_sysfs_list);
+
+struct ocfs2_filecheck {
+	struct list_head fc_head;	/* File check entry list head */
+	spinlock_t fc_lock;
+	unsigned int fc_max;	/* Maximum number of entries in list */
+	unsigned int fc_size;	/* Current entry count in list */
+	unsigned int fc_done;	/* Number of completed entries in list */
+};
+
+struct ocfs2_filecheck_sysfs_entry {
+	struct list_head fs_list;
+	atomic_t fs_count;
+	struct super_block *fs_sb;
+	struct kset *fs_kset;
+	struct ocfs2_filecheck *fs_fcheck;
+};
+
+#define OCFS2_FILECHECK_MAXSIZE		100
+#define OCFS2_FILECHECK_MINSIZE		10
+
+/* File check operation type */
+enum {
+	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
+	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
+	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
+};
+
+struct ocfs2_filecheck_entry {
+	struct list_head fe_list;
+	unsigned long fe_ino;
+	unsigned int fe_type;
+	unsigned short fe_done:1;
+	unsigned short fe_status:15;
+};
+
+struct ocfs2_filecheck_args {
+	unsigned int fa_type;
+	union {
+		unsigned long fa_ino;
+		unsigned int fa_len;
+	};
+};
+
+static const char *
+ocfs2_filecheck_error(int errno)
+{
+	if (!errno)
+		return ocfs2_filecheck_errs[errno];
+
+	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
+			errno > OCFS2_FILECHECK_ERR_END);
+	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					char *buf);
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					const char *buf, size_t count);
+static struct kobj_attribute ocfs2_attr_filecheck =
+					__ATTR(filecheck, S_IRUSR | S_IWUSR,
+					ocfs2_filecheck_show,
+					ocfs2_filecheck_store);
+
+static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
+{
+	schedule();
+	return 0;
+}
+
+static void
+ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	struct ocfs2_filecheck_entry *p;
+
+	if (!atomic_dec_and_test(&entry->fs_count))
+		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
+						TASK_UNINTERRUPTIBLE);
+
+	spin_lock(&entry->fs_fcheck->fc_lock);
+	while (!list_empty(&entry->fs_fcheck->fc_head)) {
+		p = list_first_entry(&entry->fs_fcheck->fc_head,
+				struct ocfs2_filecheck_entry, fe_list);
+		list_del(&p->fe_list);
+		BUG_ON(!p->fe_done); /* Must not free an unfinished entry */
+		kfree(p);
+	}
+	spin_unlock(&entry->fs_fcheck->fc_lock);
+
+	kset_unregister(entry->fs_kset);
+	kfree(entry->fs_fcheck);
+	kfree(entry);
+}
+
+static void
+ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+}
+
+static int ocfs2_filecheck_sysfs_del(const char *devname)
+{
+	struct ocfs2_filecheck_sysfs_entry *p;
+
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
+		if (!strcmp(p->fs_sb->s_id, devname)) {
+			list_del(&p->fs_list);
+			spin_unlock(&ocfs2_filecheck_sysfs_lock);
+			ocfs2_filecheck_sysfs_free(p);
+			return 0;
+		}
+	}
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+	return 1;
+}
+
+static void
+ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
+{
+	if (atomic_dec_and_test(&entry->fs_count))
+		wake_up_atomic_t(&entry->fs_count);
+}
+
+static struct ocfs2_filecheck_sysfs_entry *
+ocfs2_filecheck_sysfs_get(const char *devname)
+{
+	struct ocfs2_filecheck_sysfs_entry *p = NULL;
+
+	spin_lock(&ocfs2_filecheck_sysfs_lock);
+	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
+		if (!strcmp(p->fs_sb->s_id, devname)) {
+			atomic_inc(&p->fs_count);
+			spin_unlock(&ocfs2_filecheck_sysfs_lock);
+			return p;
+		}
+	}
+	spin_unlock(&ocfs2_filecheck_sysfs_lock);
+	return NULL;
+}
+
+int ocfs2_filecheck_create_sysfs(struct super_block *sb)
+{
+	int ret = 0;
+	struct kset *ocfs2_filecheck_kset = NULL;
+	struct ocfs2_filecheck *fcheck = NULL;
+	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
+	struct attribute **attrs = NULL;
+	struct attribute_group attrgp;
+
+	if (!ocfs2_kset)
+		return -ENOMEM;
+
+	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
+	if (!attrs) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		attrs[0] = &ocfs2_attr_filecheck.attr;
+		attrs[1] = NULL;
+		memset(&attrgp, 0, sizeof(attrgp));
+		attrgp.attrs = attrs;
+	}
+
+	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
+	if (!fcheck) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		INIT_LIST_HEAD(&fcheck->fc_head);
+		spin_lock_init(&fcheck->fc_lock);
+		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
+		fcheck->fc_size = 0;
+		fcheck->fc_done = 0;
+	}
+
+	if (strlen(sb->s_id) <= 0) {
+		mlog(ML_ERROR,
+		"Cannot get device basename when creating filecheck sysfs\n");
+		ret = -ENODEV;
+		goto error;
+	}
+
+	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
+						&ocfs2_kset->kobj);
+	if (!ocfs2_filecheck_kset) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
+	if (ret)
+		goto error;
+
+	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
+	if (!entry) {
+		ret = -ENOMEM;
+		goto error;
+	} else {
+		atomic_set(&entry->fs_count, 1);
+		entry->fs_sb = sb;
+		entry->fs_kset = ocfs2_filecheck_kset;
+		entry->fs_fcheck = fcheck;
+		ocfs2_filecheck_sysfs_add(entry);
+	}
+
+	kfree(attrs);
+	return 0;
+
+error:
+	kfree(attrs);
+	kfree(entry);
+	kfree(fcheck);
+	kset_unregister(ocfs2_filecheck_kset);
+	return ret;
+}
+
+int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
+{
+	return ocfs2_filecheck_sysfs_del(sb->s_id);
+}
+
+static int
+ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int count);
+static int
+ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int len)
+{
+	int ret;
+
+	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
+		return -EINVAL;
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
+		mlog(ML_ERROR,
+		"Cannot set online file check maximum entry number "
+		"to %u due to too much pending entries(%u)\n",
+		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
+		ret = -EBUSY;
+	} else {
+		if (len < ent->fs_fcheck->fc_size)
+			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
+				ent->fs_fcheck->fc_size - len));
+
+		ent->fs_fcheck->fc_max = len;
+		ret = 0;
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	return ret;
+}
+
+#define OCFS2_FILECHECK_ARGS_LEN	32
+static int
+ocfs2_filecheck_args_get_long(const char *buf, size_t count,
+				unsigned long *val)
+{
+	char buffer[OCFS2_FILECHECK_ARGS_LEN];
+
+	if (count < 1)
+		return 1;
+
+	memcpy(buffer, buf, count);
+	buffer[count] = '\0';
+
+	if (kstrtoul(buffer, 0, val))
+		return 1;
+
+	return 0;
+}
+
+static int
+ocfs2_filecheck_args_parse(const char *buf, size_t count,
+				struct ocfs2_filecheck_args *args)
+{
+	unsigned long val = 0;
+
+	/* too short/long args length */
+	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
+		return 1;
+
+	if (!strncasecmp(buf, "FIX ", 4)) {
+		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
+		args->fa_ino = val;
+		return 0;
+	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
+		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
+		args->fa_ino = val;
+		return 0;
+	} else if (!strncasecmp(buf, "SET ", 4)) {
+		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
+			return 1;
+
+		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
+		args->fa_len = (unsigned int)val;
+		return 0;
+	} else { /* invalid args */
+		return 1;
+	}
+}
+
+static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
+					struct kobj_attribute *attr,
+					char *buf)
+{
+
+	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
+	struct ocfs2_filecheck_entry *p;
+	struct ocfs2_filecheck_sysfs_entry *ent;
+
+	ent = ocfs2_filecheck_sysfs_get(kobj->name);
+	if (!ent) {
+		mlog(ML_ERROR,
+		"Cannot get the corresponding entry via device basename %s\n",
+		kobj->name);
+		return -ENODEV;
+	}
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
+	total += ret;
+	remain -= ret;
+
+	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
+		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
+			p->fe_ino, p->fe_type, p->fe_done,
+			ocfs2_filecheck_error(p->fe_status));
+		if (ret < 0) {
+			total = ret;
+			break;
+		}
+		if (ret >= remain) {
+			/* snprintf() didn't fit */
+			total = -E2BIG;
+			break;
+		}
+		total += ret;
+		remain -= ret;
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	ocfs2_filecheck_sysfs_put(ent);
+	return total;
+}
+
+static int
+ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
+{
+	struct ocfs2_filecheck_entry *p;
+
+	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
+		if (p->fe_done) {
+			list_del(&p->fe_list);
+			kfree(p);
+			ent->fs_fcheck->fc_size--;
+			ent->fs_fcheck->fc_done--;
+			return 1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
+				unsigned int count)
+{
+	unsigned int i = 0;
+	unsigned int ret = 0;
+
+	while (i++ < count) {
+		if (ocfs2_filecheck_erase_entry(ent))
+			ret++;
+		else
+			break;
+	}
+
+	return (ret == count ? 1 : 0);
+}
+
+static void
+ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
+				struct ocfs2_filecheck_entry *entry)
+{
+	entry->fe_done = 1;
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	ent->fs_fcheck->fc_done++;
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+}
+
+static unsigned short
+ocfs2_filecheck_handle(struct super_block *sb,
+				unsigned long ino, unsigned int flags)
+{
+	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
+	struct inode *inode = NULL;
+	int rc;
+
+	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
+	if (IS_ERR(inode)) {
+		rc = (int)(-(long)inode);
+		if (rc >= OCFS2_FILECHECK_ERR_START &&
+			rc < OCFS2_FILECHECK_ERR_END)
+			ret = rc;
+		else
+			ret = OCFS2_FILECHECK_ERR_FAILED;
+	} else
+		iput(inode);
+
+	return ret;
+}
+
+static void
+ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
+				struct ocfs2_filecheck_entry *entry)
+{
+	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
+		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
+				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
+	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
+		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
+				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
+	else
+		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
+
+	ocfs2_filecheck_done_entry(ent, entry);
+}
+
+static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
+				struct kobj_attribute *attr,
+				const char *buf, size_t count)
+{
+	struct ocfs2_filecheck_args args;
+	struct ocfs2_filecheck_entry *entry = NULL;
+	struct ocfs2_filecheck_sysfs_entry *ent;
+	ssize_t ret = 0;
+
+	if (count == 0)
+		return count;
+
+	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
+		mlog(ML_ERROR, "Invalid arguments for online file check\n");
+		return -EINVAL;
+	}
+
+	ent = ocfs2_filecheck_sysfs_get(kobj->name);
+	if (!ent) {
+		mlog(ML_ERROR,
+		"Cannot get the corresponding entry via device basename %s\n",
+		kobj->name);
+		return -ENODEV;
+	}
+
+	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
+		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
+		ocfs2_filecheck_sysfs_put(ent);
+		return (!ret ? count : ret);
+	}
+
+	spin_lock(&ent->fs_fcheck->fc_lock);
+	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
+		(ent->fs_fcheck->fc_done == 0)) {
+		mlog(ML_ERROR,
+		"Online file check queue(%u) is full\n",
+		ent->fs_fcheck->fc_max);
+		ret = -EBUSY;
+	} else {
+		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
+			(ent->fs_fcheck->fc_done > 0)) {
+			/* Delete the oldest entry which was done,
+			 * make sure the entry size in list does
+			 * not exceed maximum value
+			 */
+			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
+		}
+
+		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
+		if (entry) {
+			entry->fe_ino = args.fa_ino;
+			entry->fe_type = args.fa_type;
+			entry->fe_done = 0;
+			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
+			list_add_tail(&entry->fe_list,
+					&ent->fs_fcheck->fc_head);
+
+			ent->fs_fcheck->fc_size++;
+			ret = count;
+		} else {
+			ret = -ENOMEM;
+		}
+	}
+	spin_unlock(&ent->fs_fcheck->fc_lock);
+
+	if (entry)
+		ocfs2_filecheck_handle_entry(ent, entry);
+
+	ocfs2_filecheck_sysfs_put(ent);
+	return ret;
+}
diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
new file mode 100644
index 0000000..5ec331b
--- /dev/null
+++ b/fs/ocfs2/filecheck.h
@@ -0,0 +1,48 @@
+/* -*- mode: c; c-basic-offset: 8; -*-
+ * vim: noexpandtab sw=8 ts=8 sts=0:
+ *
+ * filecheck.h
+ *
+ * Online file check.
+ *
+ * Copyright (C) 2015 Novell.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation, version 2.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+
+#ifndef FILECHECK_H
+#define FILECHECK_H
+
+#include <linux/types.h>
+#include <linux/list.h>
+
+
+/* File check errno */
+enum {
+	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
+	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
+	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
+	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
+	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
+	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
+	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
+	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
+	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
+	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
+};
+
+#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
+#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
+
+int ocfs2_filecheck_create_sysfs(struct super_block *sb);
+int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
+
+#endif  /* FILECHECK_H */
diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
index 5e86b24..abd1018 100644
--- a/fs/ocfs2/inode.h
+++ b/fs/ocfs2/inode.h
@@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
 /* Flags for ocfs2_iget() */
 #define OCFS2_FI_FLAG_SYSFILE		0x1
 #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
+#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
+#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
+
 struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
 struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
 			 int sysfile_type);
-- 
2.1.2

^ permalink raw reply related	[flat|nested] 80+ messages in thread
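
The command grammar accepted by ocfs2_filecheck_args_parse() in the patch above ("FIX <n>", "CHECK <n>", "SET <n>", case-insensitive, at most 32 bytes) can be modelled in userspace roughly as follows. This is only a sketch, not the kernel code: the names fc_parse(), fc_get_ulong(), struct fc_args and enum fc_type are made up for illustration, and strtoul() with a leading-digit check stands in for the kernel's kstrtoul().

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() */

#define FC_ARGS_LEN 32	/* mirrors OCFS2_FILECHECK_ARGS_LEN */

/* Hypothetical stand-ins for the OCFS2_FILECHECK_TYPE_* values. */
enum fc_type { FC_TYPE_CHK, FC_TYPE_FIX, FC_TYPE_SET };

struct fc_args {
	enum fc_type type;
	unsigned long val;	/* inode number, or queue length for SET */
};

/* Parse the numeric tail of a command; returns 0 on success, 1 on failure. */
static int fc_get_ulong(const char *buf, size_t count, unsigned long *val)
{
	char tmp[FC_ARGS_LEN + 1];
	char *end;

	if (count < 1 || count > FC_ARGS_LEN)
		return 1;
	memcpy(tmp, buf, count);
	tmp[count] = '\0';

	/* echo appends a newline; accept exactly one, as kstrtoul() does */
	if (tmp[count - 1] == '\n')
		tmp[--count] = '\0';
	/* reject empty input, signs and leading whitespace */
	if (count == 0 || !isdigit((unsigned char)tmp[0]))
		return 1;

	errno = 0;
	*val = strtoul(tmp, &end, 0);	/* base 0: decimal, octal or hex */
	return (errno != 0 || *end != '\0');
}

/* Returns 0 and fills *args on success, 1 on any invalid input. */
static int fc_parse(const char *buf, size_t count, struct fc_args *args)
{
	static const struct {
		const char *kw;
		enum fc_type type;
	} cmds[] = {
		{ "FIX ", FC_TYPE_FIX },
		{ "CHECK ", FC_TYPE_CHK },
		{ "SET ", FC_TYPE_SET },
	};
	size_t i;

	/* too short or too long, same bounds as the patch */
	if (count < 5 || count > FC_ARGS_LEN)
		return 1;

	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
		size_t n = strlen(cmds[i].kw);

		if (count > n && !strncasecmp(buf, cmds[i].kw, n)) {
			if (fc_get_ulong(buf + n, count - n, &args->val))
				return 1;
			args->type = cmds[i].type;
			return 0;
		}
	}
	return 1;	/* unknown keyword */
}
```

A keyword table keeps the three branches in one loop; the patch spells them out as separate if/else arms, which behaves the same for these inputs.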

* [PATCH v2 3/4] ocfs2: create/remove sysfile for online file check
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-10-28  6:26   ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:26 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Create the online file check sysfile when an ocfs2 volume is mounted,
and remove it when the volume is unmounted.

Signed-off-by: Gang He <ghe@suse.com>
---
 fs/ocfs2/super.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 403c566..7213a94 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -74,6 +74,7 @@
 #include "suballoc.h"
 
 #include "buffer_head_io.h"
+#include "filecheck.h"
 
 static struct kmem_cache *ocfs2_inode_cachep;
 struct kmem_cache *ocfs2_dquot_cachep;
@@ -1202,6 +1203,9 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	/* Start this when the mount is almost sure of being successful */
 	ocfs2_orphan_scan_start(osb);
 
+	/* Create filecheck sysfile /sys/fs/ocfs2/<devname>/filecheck */
+	ocfs2_filecheck_create_sysfs(sb);
+
 	return status;
 
 read_super_error:
@@ -1658,6 +1662,7 @@ static void ocfs2_put_super(struct super_block *sb)
 
 	ocfs2_sync_blockdev(sb);
 	ocfs2_dismount_volume(sb, 0);
+	ocfs2_filecheck_remove_sysfs(sb);
 }
 
 static int ocfs2_statfs(struct dentry *dentry, struct kstatfs *buf)
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 80+ messages in thread

* [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-10-28  6:26   ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-28  6:26 UTC (permalink / raw)
  To: mfasheh, rgoldwyn; +Cc: Gang He, linux-kernel, ocfs2-devel, akpm

Implement online checking or fixing of an inode block while the
inode block is read into memory.

Signed-off-by: Gang He <ghe@suse.com>
---
 fs/ocfs2/inode.c       | 196 +++++++++++++++++++++++++++++++++++++++++++++++--
 fs/ocfs2/ocfs2_trace.h |   2 +
 2 files changed, 192 insertions(+), 6 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index b254416..d811698 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -53,6 +53,7 @@
 #include "xattr.h"
 #include "refcounttree.h"
 #include "ocfs2_trace.h"
+#include "filecheck.h"
 
 #include "buffer_head_io.h"
 
@@ -74,6 +75,13 @@ static int ocfs2_truncate_for_delete(struct ocfs2_super *osb,
 				    struct inode *inode,
 				    struct buffer_head *fe_bh);
 
+static int ocfs2_filecheck_read_inode_block_full(struct inode *inode,
+			struct buffer_head **bh, int flags, int type);
+static int ocfs2_filecheck_validate_inode_block(struct super_block *sb,
+			struct buffer_head *bh);
+static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
+			struct buffer_head *bh);
+
 void ocfs2_set_inode_flags(struct inode *inode)
 {
 	unsigned int flags = OCFS2_I(inode)->ip_attr;
@@ -127,6 +135,7 @@ struct inode *ocfs2_ilookup(struct super_block *sb, u64 blkno)
 struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 			 int sysfile_type)
 {
+	int rc = 0;
 	struct inode *inode = NULL;
 	struct super_block *sb = osb->sb;
 	struct ocfs2_find_inode_args args;
@@ -161,12 +170,17 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 	}
 	trace_ocfs2_iget5_locked(inode->i_state);
 	if (inode->i_state & I_NEW) {
-		ocfs2_read_locked_inode(inode, &args);
+		rc = ocfs2_read_locked_inode(inode, &args);
 		unlock_new_inode(inode);
 	}
 	if (is_bad_inode(inode)) {
 		iput(inode);
-		inode = ERR_PTR(-ESTALE);
+		if ((flags & OCFS2_FI_FLAG_FILECHECK_CHK) ||
+			(flags & OCFS2_FI_FLAG_FILECHECK_FIX))
+			/* Return OCFS2_FILECHECK_ERR_XXX related errno */
+			inode = ERR_PTR(rc);
+		else
+			inode = ERR_PTR(-ESTALE);
 		goto bail;
 	}
 
@@ -494,16 +508,32 @@ static int ocfs2_read_locked_inode(struct inode *inode,
 	}
 
 	if (can_lock) {
-		status = ocfs2_read_inode_block_full(inode, &bh,
-						     OCFS2_BH_IGNORE_CACHE);
+		if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+			status = ocfs2_filecheck_read_inode_block_full(inode,
+						&bh, OCFS2_BH_IGNORE_CACHE, 0);
+		else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+			status = ocfs2_filecheck_read_inode_block_full(inode,
+						&bh, OCFS2_BH_IGNORE_CACHE, 1);
+		else
+			status = ocfs2_read_inode_block_full(inode,
+						&bh, OCFS2_BH_IGNORE_CACHE);
 	} else {
 		status = ocfs2_read_blocks_sync(osb, args->fi_blkno, 1, &bh);
 		/*
 		 * If buffer is in jbd, then its checksum may not have been
 		 * computed as yet.
 		 */
-		if (!status && !buffer_jbd(bh))
-			status = ocfs2_validate_inode_block(osb->sb, bh);
+		if (!status && !buffer_jbd(bh)) {
+			if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_CHK)
+				status = ocfs2_filecheck_validate_inode_block(
+								osb->sb, bh);
+			else if (args->fi_flags & OCFS2_FI_FLAG_FILECHECK_FIX)
+				status = ocfs2_filecheck_repair_inode_block(
+								osb->sb, bh);
+			else
+				status = ocfs2_validate_inode_block(
+								osb->sb, bh);
+		}
 	}
 	if (status < 0) {
 		mlog_errno(status);
@@ -531,6 +561,14 @@ static int ocfs2_read_locked_inode(struct inode *inode,
 
 	BUG_ON(args->fi_blkno != le64_to_cpu(fe->i_blkno));
 
+	if (buffer_dirty(bh)) {
+		status = ocfs2_write_block(osb, bh, INODE_CACHE(inode));
+		if (status < 0) {
+			mlog_errno(status);
+			goto bail;
+		}
+	}
+
 	status = 0;
 
 bail:
@@ -1385,6 +1423,152 @@ bail:
 	return rc;
 }
 
+static int ocfs2_filecheck_validate_inode_block(struct super_block *sb,
+			       struct buffer_head *bh)
+{
+	int rc = 0;
+	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
+
+	trace_ocfs2_filecheck_validate_inode_block(
+		(unsigned long long)bh->b_blocknr);
+
+	BUG_ON(!buffer_uptodate(bh));
+
+	if (!OCFS2_IS_VALID_DINODE(di)) {
+		mlog(ML_ERROR,
+			"Filecheck: invalid dinode #%llu: signature = %.*s\n",
+			(unsigned long long)bh->b_blocknr, 7, di->i_signature);
+		rc = -OCFS2_FILECHECK_ERR_INVALIDINO;
+		goto bail;
+	}
+
+	rc = ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check);
+	if (rc) {
+		mlog(ML_ERROR,
+			"Filecheck: checksum failed for dinode %llu\n",
+			(unsigned long long)bh->b_blocknr);
+		rc = -OCFS2_FILECHECK_ERR_BLOCKECC;
+		goto bail;
+	}
+
+	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
+		mlog(ML_ERROR,
+			"Filecheck: invalid dinode #%llu: i_blkno is %llu\n",
+			(unsigned long long)bh->b_blocknr,
+			(unsigned long long)le64_to_cpu(di->i_blkno));
+		rc = -OCFS2_FILECHECK_ERR_BLOCKNO;
+		goto bail;
+	}
+
+	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
+		mlog(ML_ERROR,
+			"Filecheck: invalid dinode #%llu: OCFS2_VALID_FL not set\n",
+			(unsigned long long)bh->b_blocknr);
+		rc = -OCFS2_FILECHECK_ERR_VALIDFLAG;
+		goto bail;
+	}
+
+	if (le32_to_cpu(di->i_fs_generation) !=
+	    OCFS2_SB(sb)->fs_generation) {
+		mlog(ML_ERROR,
+			"Filecheck: invalid dinode #%llu: fs_generation is %u\n",
+			(unsigned long long)bh->b_blocknr,
+			le32_to_cpu(di->i_fs_generation));
+		rc = -OCFS2_FILECHECK_ERR_GENERATION;
+		goto bail;
+	}
+
+bail:
+	return rc;
+}
+
+static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
+			       struct buffer_head *bh)
+{
+	int rc;
+	int changed = 0;
+	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
+
+	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
+	/* Can't fix invalid inode block */
+	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
+		return rc;
+
+	trace_ocfs2_filecheck_repair_inode_block(
+		(unsigned long long)bh->b_blocknr);
+
+	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
+		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
+		mlog(ML_ERROR,
+			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
+			(unsigned long long)bh->b_blocknr);
+		return -OCFS2_FILECHECK_ERR_READONLY;
+	}
+
+	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
+		di->i_blkno = cpu_to_le64(bh->b_blocknr);
+		changed = 1;
+		mlog(ML_ERROR,
+			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
+			(unsigned long long)bh->b_blocknr,
+			(unsigned long long)le64_to_cpu(di->i_blkno));
+	}
+
+	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
+		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
+		changed = 1;
+		mlog(ML_ERROR,
+			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
+			(unsigned long long)bh->b_blocknr);
+	}
+
+	if (le32_to_cpu(di->i_fs_generation) !=
+	    OCFS2_SB(sb)->fs_generation) {
+		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
+		changed = 1;
+		mlog(ML_ERROR,
+			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
+			(unsigned long long)bh->b_blocknr,
+			le32_to_cpu(di->i_fs_generation));
+	}
+
+	if (changed ||
+		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
+		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
+		mark_buffer_dirty(bh);
+		mlog(ML_ERROR,
+			"Filecheck: reset dinode #%llu: compute meta ecc\n",
+			(unsigned long long)bh->b_blocknr);
+	}
+
+	return 0;
+}
+
+static int
+ocfs2_filecheck_read_inode_block_full(struct inode *inode,
+		struct buffer_head **bh, int flags, int type)
+{
+	int rc;
+	struct buffer_head *tmp = *bh;
+
+	if (!type) /* Check inode block */
+		rc = ocfs2_read_blocks(INODE_CACHE(inode),
+				OCFS2_I(inode)->ip_blkno,
+				1, &tmp, flags,
+				ocfs2_filecheck_validate_inode_block);
+	else /* Repair inode block */
+		rc = ocfs2_read_blocks(INODE_CACHE(inode),
+				OCFS2_I(inode)->ip_blkno,
+				1, &tmp, flags,
+				ocfs2_filecheck_repair_inode_block);
+
+	/* If ocfs2_read_blocks() got us a new bh, pass it up. */
+	if (!rc && !*bh)
+		*bh = tmp;
+
+	return rc;
+}
+
 int ocfs2_read_inode_block_full(struct inode *inode, struct buffer_head **bh,
 				int flags)
 {
diff --git a/fs/ocfs2/ocfs2_trace.h b/fs/ocfs2/ocfs2_trace.h
index 6cb019b..d9205e0 100644
--- a/fs/ocfs2/ocfs2_trace.h
+++ b/fs/ocfs2/ocfs2_trace.h
@@ -1540,6 +1540,8 @@ DEFINE_OCFS2_ULL_INT_EVENT(ocfs2_read_locked_inode);
 DEFINE_OCFS2_INT_INT_EVENT(ocfs2_check_orphan_recovery_state);
 
 DEFINE_OCFS2_ULL_EVENT(ocfs2_validate_inode_block);
+DEFINE_OCFS2_ULL_EVENT(ocfs2_filecheck_validate_inode_block);
+DEFINE_OCFS2_ULL_EVENT(ocfs2_filecheck_repair_inode_block);
 
 TRACE_EVENT(ocfs2_inode_is_valid_to_delete,
 	TP_PROTO(void *task, void *dc_task, unsigned long long ino,
-- 
2.1.2


^ permalink raw reply related	[flat|nested] 80+ messages in thread
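
The check-then-repair flow of ocfs2_filecheck_validate_inode_block() and ocfs2_filecheck_repair_inode_block() in the patch above can be illustrated with a small userspace model. This is a simplified sketch under stated assumptions: struct fake_dinode, the FAKE_* constants and both functions are hypothetical, and the signature check, the meta ECC handling, the read-only check and the endian conversions of the real code are left out.

```c
#include <stdint.h>

#define FAKE_VALID_FL 0x1u	/* stand-in for OCFS2_VALID_FL */

/* Hypothetical error codes, loosely modelled on OCFS2_FILECHECK_ERR_*. */
enum fake_err {
	FAKE_ERR_BLOCKNO = 1,
	FAKE_ERR_VALIDFLAG,
	FAKE_ERR_GENERATION,
};

/* Only the three fields that the repair path in the patch rewrites. */
struct fake_dinode {
	uint64_t blkno;
	uint32_t flags;
	uint32_t generation;
};

/* Report the first mismatch as a negative code, 0 if the block is clean. */
static int fake_validate(const struct fake_dinode *di,
			 uint64_t blocknr, uint32_t fs_gen)
{
	if (di->blkno != blocknr)
		return -FAKE_ERR_BLOCKNO;
	if (!(di->flags & FAKE_VALID_FL))
		return -FAKE_ERR_VALIDFLAG;
	if (di->generation != fs_gen)
		return -FAKE_ERR_GENERATION;
	return 0;
}

/*
 * Validate first and touch nothing if the block is clean; otherwise
 * reset every field that disagrees with the expected value.
 * Returns the number of fields rewritten (the kernel code returns 0
 * and marks the buffer dirty instead).
 */
static int fake_repair(struct fake_dinode *di,
		       uint64_t blocknr, uint32_t fs_gen)
{
	int changed = 0;

	if (!fake_validate(di, blocknr, fs_gen))
		return 0;

	if (di->blkno != blocknr) {
		di->blkno = blocknr;
		changed++;
	}
	if (!(di->flags & FAKE_VALID_FL)) {
		di->flags |= FAKE_VALID_FL;
		changed++;
	}
	if (di->generation != fs_gen) {
		di->generation = fs_gen;
		changed++;
	}
	return changed;
}
```

Validation stops at the first problem and reports it, while repair walks all fields, so a single FIX request can correct several inconsistencies that a CHECK would report one at a time.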

* Re: [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-10-28 16:34   ` Srinivas Eeda
  -1 siblings, 0 replies; 80+ messages in thread
From: Srinivas Eeda @ 2015-10-28 16:34 UTC (permalink / raw)
  To: Gang He, mfasheh, rgoldwyn; +Cc: linux-kernel, ocfs2-devel

Hi Gang,

Thank you for implementing this. I would like to better understand
where and how it helps. Would you mind sharing a couple of examples
(real scenarios)?

Thanks,
--Srini


On 10/27/2015 11:25 PM, Gang He wrote:
> When there are errors in the ocfs2 filesystem,
> they are usually accompanied by the inode number which caused the error.
> This inode number would be the input to fixing the file.
> One of these options could be considered:
> A file in the sys filesystem which would accept inode numbers.
> This could be used to communicate back what has to be fixed or is fixed.
> You could write:
> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
> or
> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>
> Compared with the first version, I use strncasecmp instead of double strncmp
> functions. Second, update the source file contribution vendor.
>
> Gang He (4):
>    ocfs2: export ocfs2_kset for online file check
>    ocfs2: sysfile interfaces for online file check
>    ocfs2: create/remove sysfile for online file check
>    ocfs2: check/fix inode block for online file check
>
>   fs/ocfs2/Makefile      |   3 +-
>   fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
>   fs/ocfs2/filecheck.h   |  48 +++++
>   fs/ocfs2/inode.c       | 196 ++++++++++++++++-
>   fs/ocfs2/inode.h       |   3 +
>   fs/ocfs2/ocfs2_trace.h |   2 +
>   fs/ocfs2/stackglue.c   |   3 +-
>   fs/ocfs2/stackglue.h   |   2 +
>   fs/ocfs2/super.c       |   5 +
>   9 files changed, 820 insertions(+), 8 deletions(-)
>   create mode 100644 fs/ocfs2/filecheck.c
>   create mode 100644 fs/ocfs2/filecheck.h
>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature
  2015-10-28 16:34   ` Srinivas Eeda
@ 2015-10-29  4:44     ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-29  4:44 UTC (permalink / raw)
  To: srinivas.eeda, Mark Fasheh, rgoldwyn; +Cc: ocfs2-devel, linux-kernel

Hello Srini,

There is a doc about ocfs2 online file check.

OCFS2 online file check
-----------------------

This document describes the OCFS2 online file check feature.

Introduction
============
OCFS2 is often used in high-availability systems. However, OCFS2 usually
converts the filesystem to read-only on errors. This may not be necessary, and
turning the filesystem read-only affects other running processes as well,
decreasing availability. For this reason, a mount option (errors=continue) was
introduced, which returns EIO to the calling process and terminates further
processing so that the filesystem is not corrupted further. The filesystem is
then not converted to read-only, and the problematic file's inode number is
reported in the kernel log so that the user can try to check/fix this file via
the online filecheck feature.

Scope
=====
This effort is to check/fix small issues which may hinder day-to-day operations
of a cluster filesystem by turning the filesystem read-only. The scope of
checking/fixing is at the file level, initially for regular files and eventually
for all files (including system files) of the filesystem.

If directory-to-file links are incorrect, the directory inode is
reported as erroneous.

This feature is not suited for extensive checks which depend on other
components of the filesystem, such as, but not limited to, checking whether the
bits for file blocks in the allocation have been set. In case of such an error,
the offline fsck is recommended.

Finally, such an operation/feature should not be automated, lest the filesystem
end up with more damage than before the repair attempt. So, this has to
be performed with user interaction and consent.

User interface
==============
When there are errors in the OCFS2 filesystem, they are usually accompanied
by the inode number which caused the error. This inode number would be the
input to check/fix the file.

There is a sysfs file for each mounted OCFS2 file system:

  /sys/fs/ocfs2/<devname>/filecheck

Here, <devname> indicates the name of the OCFS2 volume device which has already
been mounted. The file above accepts inode numbers. It is used to communicate
with kernel space and tell it which file (inode number) will be checked or
fixed. Currently, three operations are supported: checking an inode, fixing an
inode, and setting the size of the result record history.

1. If you want to know exactly what error happened to <inode> before fixing, do

  # echo "CHECK <inode>" > /sys/fs/ocfs2/<devname>/filecheck
  # cat /sys/fs/ocfs2/<devname>/filecheck

The output is like this:
  INO		TYPE		DONE		ERROR
39502		0		1		GENERATION

<INO> lists the inode numbers.
<TYPE> is the kind of operation performed: 0 for inode check, 1 for inode fix.
<DONE> indicates whether the operation has finished.
<ERROR> says what kind of error was found. For the details, please refer to the
file linux/fs/ocfs2/filecheck.h.

2. If you decide to fix this inode, do

  # echo "FIX <inode>" > /sys/fs/ocfs2/<devname>/filecheck
  # cat /sys/fs/ocfs2/<devname>/filecheck

The output is like this:
  INO		TYPE		DONE		ERROR
39502		1		1		SUCCESS

This time, the <ERROR> column indicates whether this fix is successful or not.

3. The record cache is used to store the history of check/fix results. Its
default size is 10, and it can be adjusted within the range of 10 to 100. You
can adjust the size like this:

  # echo "SET <size>" > /sys/fs/ocfs2/<devname>/filecheck
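
The three operations above can be wrapped in a small helper that rejects
malformed requests before they reach the kernel. This is only a sketch, not
part of the patch set: the function name filecheck_cmd and the FILECHECK
variable (holding the path of the sysfs file) are hypothetical, and the only
rules it enforces are the ones stated above (CHECK/FIX/SET, a numeric
argument, and a history size between 10 and 100).

```shell
# Sketch only: FILECHECK is a hypothetical variable naming the sysfs file,
# e.g. /sys/fs/ocfs2/<devname>/filecheck.

# filecheck_cmd OP ARG: validate the request, then write it to $FILECHECK.
filecheck_cmd() {
    op=$1
    arg=$2
    # The argument must be a non-empty string of decimal digits.
    case $arg in
        ''|*[!0-9]*) echo "invalid number: $arg" >&2; return 1 ;;
    esac
    case $op in
        CHECK|FIX) ;;
        SET)
            # The documented range for the history size is 10 to 100.
            if [ "$arg" -lt 10 ] || [ "$arg" -gt 100 ]; then
                echo "size out of range (10-100): $arg" >&2
                return 1
            fi
            ;;
        *) echo "unknown operation: $op" >&2; return 1 ;;
    esac
    echo "$op $arg" > "$FILECHECK"
}
```

With FILECHECK set to the sysfs path, "filecheck_cmd CHECK 39502" is then
equivalent to the echo command in step 1, and a rejected request leaves the
file untouched.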

Fixing stuff
============
On receiving the inode, the filesystem would read the inode and the
file metadata. In case of errors, the filesystem would fix the errors
and report the problems it fixed in the kernel log. As a precautionary measure,
the inode must first be checked for errors before performing a final fix.

The inode and the fix history will be maintained temporarily in a
small linked list buffer which would contain the last (N) inodes
fixed/checked, along with the logs of what errors were reported/fixed.
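
Since a check has to come before a fix, a script may want to read the check
result back out of the history before deciding anything. A minimal sketch,
assuming the column layout shown in the sample output above and assuming that
later lines are more recent (the order is not stated here); the function name
filecheck_last_error is hypothetical:

```shell
# filecheck_last_error INO TYPE: read result records on stdin and print the
# ERROR column of the last finished record (DONE == 1) matching inode INO
# and operation TYPE (0 = check, 1 = fix). Prints nothing if none matches.
filecheck_last_error() {
    awk -v ino="$1" -v type="$2" '
        $1 == ino && $2 == type && $3 == 1 { err = $4 }
        END { if (err != "") print err }
    '
}
```

For example, "filecheck_last_error 39502 0 < /sys/fs/ocfs2/<devname>/filecheck"
prints the ERROR value recorded by the last finished CHECK of inode 39502.
Whether to go on and issue FIX is still left to the user.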

Thanks
Gang


>>> 
> Hi Gang,
> 
> thank you for implementing this. I would like to understand this better 
> on where and how it helps ... would you mind sharing couple 
> examples(real scenarios).
> 
> Thanks,
> --Srini
> 
> 
> On 10/27/2015 11:25 PM, Gang He wrote:
>> When there are errors in the ocfs2 filesystem,
>> they are usually accompanied by the inode number which caused the error.
>> This inode number would be the input to fixing the file.
>> One of these options could be considered:
>> A file in the sys filesytem which would accept inode numbers.
>> This could be used to communication back what has to be fixed or is fixed.
>> You could write:
>> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>> or
>> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>>
>> Compare with first version, I use strncasecmp instead of double strncmp
>> functions. Second, update the source file contribution vendor.
>>
>> Gang He (4):
>>    ocfs2: export ocfs2_kset for online file check
>>    ocfs2: sysfile interfaces for online file check
>>    ocfs2: create/remove sysfile for online file check
>>    ocfs2: check/fix inode block for online file check
>>
>>   fs/ocfs2/Makefile      |   3 +-
>>   fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/ocfs2/filecheck.h   |  48 +++++
>>   fs/ocfs2/inode.c       | 196 ++++++++++++++++-
>>   fs/ocfs2/inode.h       |   3 +
>>   fs/ocfs2/ocfs2_trace.h |   2 +
>>   fs/ocfs2/stackglue.c   |   3 +-
>>   fs/ocfs2/stackglue.h   |   2 +
>>   fs/ocfs2/super.c       |   5 +
>>   9 files changed, 820 insertions(+), 8 deletions(-)
>>   create mode 100644 fs/ocfs2/filecheck.c
>>   create mode 100644 fs/ocfs2/filecheck.h
>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature
  2015-10-29  4:44     ` Gang He
@ 2015-10-29  7:46       ` Srinivas Eeda
  -1 siblings, 0 replies; 80+ messages in thread
From: Srinivas Eeda @ 2015-10-29  7:46 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: ocfs2-devel, linux-kernel

Hi Gang,

Thanks for pointing to the explanation of the feature.

What I am curious about is: what were the real cases you came across
that prompted this change, and how would this change help in those cases?

Thanks,
--Srini


On 10/28/2015 09:44 PM, Gang He wrote:
> Hello Srini,
>
> There is a doc about ocfs2 online file check.
>
> OCFS2 online file check
> -----------------------
>
> This document will describe OCFS2 online file check feature.
>
> Introduction
> ============
> OCFS2 is often used in high-availaibility systems. However, OCFS2 usually
> converts the filesystem to read-only on errors. This may not be necessary, since
> turning the filesystem read-only would affect other running processes as well,
> decreasing availability. Then, a mount option (errors=continue) was introduced,
> which would return the EIO to the calling process and terminate furhter
> processing so that the filesystem is not corrupted further. So,the filesystem is
> not converted to read-only, and the problematic file's inode number is reported
> in the kernel log so that the user can try to check/fix this file via online
> filecheck feature.
>
> Scope
> =====
> This effort is to check/fix small issues which may hinder day-to-day operations
> of a cluster filesystem by turning the filesystem read-only. The scope of
> checking/fixing is at the file level, initially for regular files and eventually
> to all files (including system files) of the filesystem.
>
> In case of directory to file links is incorrect, the directory inode is
> reported as erroneous.
>
> This feature is not suited for extravagant checks which involve dependency of
> other components of the filesystem, such as but not limited to, checking if the
> bits for file blocks in the allocation has been set. In case of such an error,
> the offline fsck should/would be recommended.
>
> Finally, such an operation/feature should not be automated lest the filesystem
> may end up with more damage than before the repair attempt. So, this has to
> be performed using user interaction and consent.
>
> User interface
> ==============
> When there are errors in the OCFS2 filesystem, they are usually accompanied
> by the inode number which caused the error. This inode number would be the
> input to check/fix the file.
>
> There is a sysfs file for each OCFS2 file system mounting:
>
>    /sys/fs/ocfs2/<devname>/filecheck
>
> Here, <devname> indicates the name of OCFS2 volumn device which has been already
> mounted. The file above would accept inode numbers. This could be used to
> communicate with kernel space, tell which file(inode number) will be checked or
> fixed. Currently, three operations are supported, which includes checking
> inode, fixing inode and setting the size of result record history.
>
> 1. If you want to know what error exactly happened to <inode> before fixing, do
>
>    # echo "CHECK <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>    # cat /sys/fs/ocfs2/<devname>/filecheck
>
> The output is like this:
>    INO		TYPE		DONE		ERROR
> 39502		0		1		GENERATION
>
> <INO> lists the inode numbers.
> <TYPE> is what kind of operation you've done, 0 for inode check,1 for inode fix.
> <DONE> 	indicates whether the operation has been finished.
> <ERROR> says what kind of errors was found. For the details, please refer to the
> file linux/fs/ocfs2/filecheck.h.
>
> 2. If you determine to fix this inode, do
>
>    # echo "FIX <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>    # cat /sys/fs/ocfs2/<devname>/filecheck
>
> The output is like this:
>    INO		TYPE		DONE		ERROR
> 39502		1		1		SUCCESS
>
> This time, the <ERROR> column indicates whether this fix is successful or not.
>
> 3. The record cache is used to store the history of check/fix result. Its
> defalut size is 10, and can be adjust between the range of 10 ~ 100. You can
> adjust the size like this:
>
>    # echo "SET <size>" > /sys/fs/ocfs2/<devname>/filecheck
>
> Fixing stuff
> ============
> On receivng the inode, the filesystem would read the inode and the
> file metadata. In case of errors, the filesystem would fix the errors
> and report the problems it fixed in the kernel log. As a precautionary measure,
> the inode must first be checked for errors before performing a final fix.
>
> The inode and the fix history will be maintained temporarily in a
> small linked list buffer which would contain the last (N) inodes
> fixed/checked, along with the logs of what errors were reported/fixed.
>
> Thanks
> Gang
>
>
>> Hi Gang,
>>
>> thank you for implementing this. I would like to understand this better
>> on where and how it helps ... would you mind sharing couple
>> examples(real scenarios).
>>
>> Thanks,
>> --Srini
>>
>>
>> On 10/27/2015 11:25 PM, Gang He wrote:
>>> When there are errors in the ocfs2 filesystem,
>>> they are usually accompanied by the inode number which caused the error.
>>> This inode number would be the input to fixing the file.
>>> One of these options could be considered:
>>> A file in the sys filesytem which would accept inode numbers.
>>> This could be used to communication back what has to be fixed or is fixed.
>>> You could write:
>>> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>>> or
>>> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>>>
>>> Compare with first version, I use strncasecmp instead of double strncmp
>>> functions. Second, update the source file contribution vendor.
>>>
>>> Gang He (4):
>>>     ocfs2: export ocfs2_kset for online file check
>>>     ocfs2: sysfile interfaces for online file check
>>>     ocfs2: create/remove sysfile for online file check
>>>     ocfs2: check/fix inode block for online file check
>>>
>>>    fs/ocfs2/Makefile      |   3 +-
>>>    fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>    fs/ocfs2/filecheck.h   |  48 +++++
>>>    fs/ocfs2/inode.c       | 196 ++++++++++++++++-
>>>    fs/ocfs2/inode.h       |   3 +
>>>    fs/ocfs2/ocfs2_trace.h |   2 +
>>>    fs/ocfs2/stackglue.c   |   3 +-
>>>    fs/ocfs2/stackglue.h   |   2 +
>>>    fs/ocfs2/super.c       |   5 +
>>>    9 files changed, 820 insertions(+), 8 deletions(-)
>>>    create mode 100644 fs/ocfs2/filecheck.c
>>>    create mode 100644 fs/ocfs2/filecheck.h
>>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature
  2015-10-29  7:46       ` Srinivas Eeda
@ 2015-10-29  8:26         ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-29  8:26 UTC (permalink / raw)
  To: srinivas.eeda, Mark Fasheh, rgoldwyn; +Cc: ocfs2-devel, linux-kernel

Hello Srini,

The real cases are ones where we try to fix isolated issues without taking
the file system offline (the errors=continue mount option was introduced for
this). The online file check feature is meant for fixing isolated or light
metadata block corruption, e.g. in an inode block, file extent block, or
directory entry block. These corruptions are typically checksum errors,
block number inconsistencies, etc.

Thanks
Gang  



>>> 
> Hi Gang,
> 
> thanks for pointing to explanation of the feature.
> 
> What I am curious about is ... what were the real cases that you came 
> across prompted this change and how this change would help in that case.
> 
> Thanks,
> --Srini
> 
> 
> On 10/28/2015 09:44 PM, Gang He wrote:
>> Hello Srini,
>>
>> There is a doc about ocfs2 online file check.
>>
>> OCFS2 online file check
>> -----------------------
>>
>> This document will describe OCFS2 online file check feature.
>>
>> Introduction
>> ============
>> OCFS2 is often used in high-availaibility systems. However, OCFS2 usually
>> converts the filesystem to read-only on errors. This may not be necessary, since
>> turning the filesystem read-only would affect other running processes as well,
>> decreasing availability. Then, a mount option (errors=continue) was introduced,
>> which would return the EIO to the calling process and terminate furhter
>> processing so that the filesystem is not corrupted further. So,the filesystem is
>> not converted to read-only, and the problematic file's inode number is reported
>> in the kernel log so that the user can try to check/fix this file via online
>> filecheck feature.
>>
>> Scope
>> =====
>> This effort is to check/fix small issues which may hinder day-to-day operations
>> of a cluster filesystem by turning the filesystem read-only. The scope of
>> checking/fixing is at the file level, initially for regular files and eventually
>> to all files (including system files) of the filesystem.
>>
>> In case of directory to file links is incorrect, the directory inode is
>> reported as erroneous.
>>
>> This feature is not suited for extravagant checks which involve dependency of
>> other components of the filesystem, such as but not limited to, checking if the
>> bits for file blocks in the allocation has been set. In case of such an error,
>> the offline fsck should/would be recommended.
>>
>> Finally, such an operation/feature should not be automated lest the filesystem
>> may end up with more damage than before the repair attempt. So, this has to
>> be performed using user interaction and consent.
>>
>> User interface
>> ==============
>> When there are errors in the OCFS2 filesystem, they are usually accompanied
>> by the inode number which caused the error. This inode number would be the
>> input to check/fix the file.
>>
>> There is a sysfs file for each OCFS2 file system mounting:
>>
>>    /sys/fs/ocfs2/<devname>/filecheck
>>
>> Here, <devname> indicates the name of OCFS2 volumn device which has been already
>> mounted. The file above would accept inode numbers. This could be used to
>> communicate with kernel space, tell which file(inode number) will be checked or
>> fixed. Currently, three operations are supported, which includes checking
>> inode, fixing inode and setting the size of result record history.
>>
>> 1. If you want to know what error exactly happened to <inode> before fixing, do
>>
>>    # echo "CHECK <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>>    # cat /sys/fs/ocfs2/<devname>/filecheck
>>
>> The output is like this:
>>    INO		TYPE		DONE		ERROR
>> 39502		0		1		GENERATION
>>
>> <INO> lists the inode numbers.
>> <TYPE> is the kind of operation performed: 0 for inode check, 1 for inode fix.
>> <DONE> indicates whether the operation has finished.
>> <ERROR> says what kind of error was found. For details, please refer to the
>> file linux/fs/ocfs2/filecheck.h.
>>
>> 2. If you decide to fix this inode, do
>>
>>    # echo "FIX <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>>    # cat /sys/fs/ocfs2/<devname>/filecheck
>>
>> The output is like this:
>>    INO		TYPE		DONE		ERROR
>> 39502		1		1		SUCCESS
>>
>> This time, the <ERROR> column indicates whether this fix is successful or not.
>>
>> 3. The record cache is used to store the history of check/fix results. Its
>> default size is 10, and it can be adjusted within the range 10 to 100. You can
>> adjust the size like this:
>>
>>    # echo "SET <size>" > /sys/fs/ocfs2/<devname>/filecheck
>>
>> Fixing stuff
>> ============
>> On receiving the inode, the filesystem would read the inode and the
>> file metadata. In case of errors, the filesystem would fix the errors
>> and report the problems it fixed in the kernel log. As a precautionary
>> measure, the inode must first be checked for errors before performing a
>> final fix.
>>
>> The inode and the fix history will be maintained temporarily in a
>> small linked list buffer which would contain the last (N) inodes
>> fixed/checked, along with the logs of what errors were reported/fixed.
>>
>> Thanks
>> Gang
>>
>>
>>> Hi Gang,
>>>
>>> thank you for implementing this. I would like to understand better
>>> where and how it helps ... would you mind sharing a couple of
>>> examples (real scenarios)?
>>>
>>> Thanks,
>>> --Srini
>>>
>>>
>>> On 10/27/2015 11:25 PM, Gang He wrote:
>>>> When there are errors in the ocfs2 filesystem,
>>>> they are usually accompanied by the inode number which caused the error.
>>>> This inode number would be the input to fixing the file.
>>>> One of these options could be considered:
>>>> A file in the sys filesystem which would accept inode numbers.
>>>> This could be used to communicate back what has to be fixed or is fixed.
>>>> You could write:
>>>> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>>>> or
>>>> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>>>>
>>>> Compared with the first version, I use strncasecmp instead of double strncmp
>>>> functions. Second, I updated the source file contribution vendor.
>>>>
>>>> Gang He (4):
>>>>     ocfs2: export ocfs2_kset for online file check
>>>>     ocfs2: sysfile interfaces for online file check
>>>>     ocfs2: create/remove sysfile for online file check
>>>>     ocfs2: check/fix inode block for online file check
>>>>
>>>>    fs/ocfs2/Makefile      |   3 +-
>>>>    fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    fs/ocfs2/filecheck.h   |  48 +++++
>>>>    fs/ocfs2/inode.c       | 196 ++++++++++++++++-
>>>>    fs/ocfs2/inode.h       |   3 +
>>>>    fs/ocfs2/ocfs2_trace.h |   2 +
>>>>    fs/ocfs2/stackglue.c   |   3 +-
>>>>    fs/ocfs2/stackglue.h   |   2 +
>>>>    fs/ocfs2/super.c       |   5 +
>>>>    9 files changed, 820 insertions(+), 8 deletions(-)
>>>>    create mode 100644 fs/ocfs2/filecheck.c
>>>>    create mode 100644 fs/ocfs2/filecheck.h
>>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature
@ 2015-10-29  8:26         ` Gang He
  0 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-10-29  8:26 UTC (permalink / raw)
  To: srinivas.eeda, Mark Fasheh, rgoldwyn; +Cc: ocfs2-devel, linux-kernel

Hello Srini,

The real cases are that we try to fix some independent issues without taking the file system offline (errors=continue was introduced for this).
The online file check feature is meant for fixing independent or light metadata block corruption, e.g. in an inode block, file extent block, or directory entry block.
These corruptions are usually things like checksum errors or block number inconsistencies.

Thanks
Gang  




>>> 
> Hi Gang,
> 
> thanks for pointing to the explanation of the feature.
> 
> What I am curious about is ... what were the real cases you came
> across that prompted this change, and how would this change help in those cases?
> 
> Thanks,
> --Srini
> 
> 
> On 10/28/2015 09:44 PM, Gang He wrote:
>> Hello Srini,
>>
>> There is a doc about ocfs2 online file check.
>>
>> OCFS2 online file check
>> -----------------------
>>
>> This document describes the OCFS2 online file check feature.
>>
>> Introduction
>> ============
>> OCFS2 is often used in high-availability systems. However, OCFS2 usually
>> converts the filesystem to read-only on errors. This may not be necessary,
>> since turning the filesystem read-only would affect other running processes
>> as well, decreasing availability. Then, a mount option (errors=continue) was
>> introduced, which would return EIO to the calling process and terminate
>> further processing so that the filesystem is not corrupted further. So, the
>> filesystem is not converted to read-only, and the problematic file's inode
>> number is reported in the kernel log so that the user can try to check/fix
>> this file via the online filecheck feature.
>>
>> Scope
>> =====
>> This effort is to check/fix small issues which may hinder day-to-day
>> operations of a cluster filesystem by turning the filesystem read-only. The
>> scope of checking/fixing is at the file level, initially for regular files
>> and eventually for all files (including system files) of the filesystem.
>>
>> If a directory-to-file link is incorrect, the directory inode is
>> reported as erroneous.
>>
>> This feature is not suited for extravagant checks which involve dependencies
>> on other components of the filesystem, such as, but not limited to, checking
>> whether the bits for file blocks in the allocation have been set. In case of
>> such an error, the offline fsck is recommended.
>>
>> Finally, such an operation/feature should not be automated, lest the
>> filesystem end up with more damage than before the repair attempt. So, this
>> has to be performed with user interaction and consent.
>>
>> User interface
>> ==============
>> When there are errors in the OCFS2 filesystem, they are usually accompanied
>> by the inode number which caused the error. This inode number would be the
>> input to check/fix the file.
>>
>> There is a sysfs file for each mounted OCFS2 file system:
>>
>>    /sys/fs/ocfs2/<devname>/filecheck
>>
>> Here, <devname> is the name of the OCFS2 volume device which has already
>> been mounted. The file above accepts inode numbers. It is used to
>> communicate with kernel space and tell it which file (inode number) is to be
>> checked or fixed. Currently, three operations are supported: checking an
>> inode, fixing an inode, and setting the size of the result record history.
>>
>> 1. If you want to know exactly what error happened to <inode> before
>> fixing, do
>>
>>    # echo "CHECK <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>>    # cat /sys/fs/ocfs2/<devname>/filecheck
>>
>> The output is like this:
>>    INO		TYPE		DONE		ERROR
>> 39502		0		1		GENERATION
>>
>> <INO> lists the inode numbers.
>> <TYPE> is the kind of operation performed: 0 for inode check, 1 for inode fix.
>> <DONE> indicates whether the operation has finished.
>> <ERROR> says what kind of error was found. For details, please refer to the
>> file linux/fs/ocfs2/filecheck.h.
>>
>> 2. If you decide to fix this inode, do
>>
>>    # echo "FIX <inode>" > /sys/fs/ocfs2/<devname>/filecheck
>>    # cat /sys/fs/ocfs2/<devname>/filecheck
>>
>> The output is like this:
>>    INO		TYPE		DONE		ERROR
>> 39502		1		1		SUCCESS
>>
>> This time, the <ERROR> column indicates whether this fix is successful or not.
>>
>> 3. The record cache is used to store the history of check/fix results. Its
>> default size is 10, and it can be adjusted within the range 10 to 100. You can
>> adjust the size like this:
>>
>>    # echo "SET <size>" > /sys/fs/ocfs2/<devname>/filecheck
>>
>> Fixing stuff
>> ============
>> On receiving the inode, the filesystem would read the inode and the
>> file metadata. In case of errors, the filesystem would fix the errors
>> and report the problems it fixed in the kernel log. As a precautionary
>> measure, the inode must first be checked for errors before performing a
>> final fix.
>>
>> The inode and the fix history will be maintained temporarily in a
>> small linked list buffer which would contain the last (N) inodes
>> fixed/checked, along with the logs of what errors were reported/fixed.
>>
>> Thanks
>> Gang
>>
>>
>>> Hi Gang,
>>>
>>> thank you for implementing this. I would like to understand better
>>> where and how it helps ... would you mind sharing a couple of
>>> examples (real scenarios)?
>>>
>>> Thanks,
>>> --Srini
>>>
>>>
>>> On 10/27/2015 11:25 PM, Gang He wrote:
>>>> When there are errors in the ocfs2 filesystem,
>>>> they are usually accompanied by the inode number which caused the error.
>>>> This inode number would be the input to fixing the file.
>>>> One of these options could be considered:
>>>> A file in the sys filesystem which would accept inode numbers.
>>>> This could be used to communicate back what has to be fixed or is fixed.
>>>> You could write:
>>>> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>>>> or
>>>> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>>>>
>>>> Compared with the first version, I use strncasecmp instead of double strncmp
>>>> functions. Second, I updated the source file contribution vendor.
>>>>
>>>> Gang He (4):
>>>>     ocfs2: export ocfs2_kset for online file check
>>>>     ocfs2: sysfile interfaces for online file check
>>>>     ocfs2: create/remove sysfile for online file check
>>>>     ocfs2: check/fix inode block for online file check
>>>>
>>>>    fs/ocfs2/Makefile      |   3 +-
>>>>    fs/ocfs2/filecheck.c   | 566 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    fs/ocfs2/filecheck.h   |  48 +++++
>>>>    fs/ocfs2/inode.c       | 196 ++++++++++++++++-
>>>>    fs/ocfs2/inode.h       |   3 +
>>>>    fs/ocfs2/ocfs2_trace.h |   2 +
>>>>    fs/ocfs2/stackglue.c   |   3 +-
>>>>    fs/ocfs2/stackglue.h   |   2 +
>>>>    fs/ocfs2/super.c       |   5 +
>>>>    9 files changed, 820 insertions(+), 8 deletions(-)
>>>>    create mode 100644 fs/ocfs2/filecheck.c
>>>>    create mode 100644 fs/ocfs2/filecheck.h
>>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-10-28  6:26   ` [Ocfs2-devel] " Gang He
@ 2015-11-03  7:12     ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  7:12 UTC (permalink / raw)
  To: Gang He, mfasheh, rgoldwyn; +Cc: linux-kernel, ocfs2-devel, akpm

Hi Gang,

This does not look like the right patch.
First, online file check only checks the inode's block number, valid flag,
fs generation value, and meta ecc. I have never seen a real corruption
that affected only these fields; if these fields are corrupted, it means
something bad may have happened elsewhere as well. So fixing these fields
may not help, and may even make the corruption worse.
Second, the repair method is wrong. In
ocfs2_filecheck_repair_inode_block(), if these fields on disk don't
match the ones in memory, the ones in memory are used to update the disk
fields. The question is: how do you know the fields in memory are
right (they may be the ones that are really corrupted)?
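
To make the second concern concrete, here is a minimal userspace sketch, not
the patch's code. struct mock_dinode, mock_check(), mock_repair() and the
MOCK_* names are hypothetical stand-ins for the real ocfs2 structures, and
the meta ecc step is left out. It only assumes what the quoted function
shows: the repair path copies the expected (in-memory) values over the
on-disk fields whenever they differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the on-disk inode fields. */
struct mock_dinode {
	uint64_t i_blkno;
	uint32_t i_flags;
	uint32_t i_fs_generation;
};

#define MOCK_VALID_FL 0x1u

/* Hypothetical result codes, loosely modelled on the names in the patch. */
enum mock_err {
	MOCK_OK = 0,
	MOCK_ERR_BLOCKNO,
	MOCK_ERR_VALIDFLAG,
	MOCK_ERR_GENERATION
};

/* Check: compare the on-disk fields with the expected values; read-only. */
static enum mock_err mock_check(const struct mock_dinode *di,
				uint64_t expect_blkno, uint32_t expect_gen)
{
	assert(di != NULL);
	if (di->i_blkno != expect_blkno)
		return MOCK_ERR_BLOCKNO;
	if (!(di->i_flags & MOCK_VALID_FL))
		return MOCK_ERR_VALIDFLAG;
	if (di->i_fs_generation != expect_gen)
		return MOCK_ERR_GENERATION;
	return MOCK_OK;
}

/*
 * Repair: overwrite every mismatching on-disk field with the expected
 * value. Nothing here can tell whether the disk or the expected value is
 * the corrupted side. Returns the number of fields rewritten.
 */
static int mock_repair(struct mock_dinode *di,
		       uint64_t expect_blkno, uint32_t expect_gen)
{
	int changed = 0;

	assert(di != NULL);
	if (di->i_blkno != expect_blkno) {
		di->i_blkno = expect_blkno;
		changed++;
	}
	if (!(di->i_flags & MOCK_VALID_FL)) {
		di->i_flags |= MOCK_VALID_FL;
		changed++;
	}
	if (di->i_fs_generation != expect_gen) {
		di->i_fs_generation = expect_gen;
		changed++;
	}
	return changed;
}
```

In this sketch, calling mock_repair() with a wrong expected generation on a
perfectly good inode rewrites the good on-disk value, and a later
mock_check() against the same wrong value then reports success.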

Thanks,
Junxiao.
On 10/28/2015 02:26 PM, Gang He wrote:
> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
> +			       struct buffer_head *bh)
> +{
> +	int rc;
> +	int changed = 0;
> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
> +
> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
> +	/* Can't fix invalid inode block */
> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
> +		return rc;
> +
> +	trace_ocfs2_filecheck_repair_inode_block(
> +		(unsigned long long)bh->b_blocknr);
> +
> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
> +		mlog(ML_ERROR,
> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
> +			(unsigned long long)bh->b_blocknr);
> +		return -OCFS2_FILECHECK_ERR_READONLY;
> +	}
> +
> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
> +		changed = 1;
> +		mlog(ML_ERROR,
> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
> +			(unsigned long long)bh->b_blocknr,
> +			(unsigned long long)le64_to_cpu(di->i_blkno));
> +	}
> +
> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
> +		changed = 1;
> +		mlog(ML_ERROR,
> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
> +			(unsigned long long)bh->b_blocknr);
> +	}
> +
> +	if (le32_to_cpu(di->i_fs_generation) !=
> +	    OCFS2_SB(sb)->fs_generation) {
> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
> +		changed = 1;
> +		mlog(ML_ERROR,
> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
> +			(unsigned long long)bh->b_blocknr,
> +			le32_to_cpu(di->i_fs_generation));
> +	}
> +
> +	if (changed ||
> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
> +		mark_buffer_dirty(bh);
> +		mlog(ML_ERROR,
> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
> +			(unsigned long long)bh->b_blocknr);
> +	}
> +
> +	return 0;
> +}


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-10-28  6:25   ` [Ocfs2-devel] " Gang He
@ 2015-11-03  7:20     ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  7:20 UTC (permalink / raw)
  To: Gang He, mfasheh, rgoldwyn; +Cc: linux-kernel, ocfs2-devel, akpm

Hi Gang,

I don't see a need to add a sysfs file for the check and repair. This
leaves customers with a hard decision: how do they decide whether
they should repair the bad inode, since this may make the corruption even
worse?
I think the error should be fixed by this feature automatically if repair
helps. Of course, this should be done only when errors=continue is enabled,
or a new mount option could be added for it.
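
For reference, this is roughly what an administrator's tool would have to do
to drive the proposed interface by hand. It is a sketch under assumptions:
the path layout and the CHECK/FIX/SET keywords are taken from the cover
letter, while filecheck_path(), filecheck_cmd() and filecheck_submit() are
hypothetical helper names that exist in neither the patch nor any ocfs2
tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build "/sys/fs/ocfs2/<devname>/filecheck" into out.
 * Returns 0 on success, -1 if the result does not fit in len bytes.
 */
static int filecheck_path(char *out, size_t len, const char *devname)
{
	int n;

	assert(out != NULL && devname != NULL);
	n = snprintf(out, len, "/sys/fs/ocfs2/%s/filecheck", devname);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Build "<op> <arg>" into out, where op is "CHECK", "FIX" or "SET".
 * Returns 0 on success, -1 on an unknown op or if it does not fit.
 */
static int filecheck_cmd(char *out, size_t len, const char *op,
			 unsigned long arg)
{
	int n;

	assert(out != NULL && op != NULL);
	if (strcmp(op, "CHECK") != 0 && strcmp(op, "FIX") != 0 &&
	    strcmp(op, "SET") != 0)
		return -1;
	n = snprintf(out, len, "%s %lu", op, arg);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write cmd to the file at path. Returns 0 on success, -1 on failure. */
static int filecheck_submit(const char *path, const char *cmd)
{
	FILE *f;
	int rc;

	assert(path != NULL && cmd != NULL);
	f = fopen(path, "w");
	if (!f)
		return -1;
	rc = (fputs(cmd, f) < 0) ? -1 : 0;
	if (fclose(f) != 0)
		rc = -1;
	return rc;
}
```

A caller would build the path and a "CHECK <inode>" command, submit it, read
the file back, and only then decide whether to submit "FIX <inode>". That
last decision is the one left to the customer.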

Thanks,
Junxiao.

On 10/28/2015 02:25 PM, Gang He wrote:
> Implement online file check sysfile interfaces, e.g.
> how to create the related sysfile according to device name,
> how to display/handle file check request from the sysfile.
> 
> Signed-off-by: Gang He <ghe@suse.com>
> ---
>  fs/ocfs2/Makefile    |   3 +-
>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/ocfs2/filecheck.h |  48 +++++
>  fs/ocfs2/inode.h     |   3 +
>  4 files changed, 619 insertions(+), 1 deletion(-)
>  create mode 100644 fs/ocfs2/filecheck.c
>  create mode 100644 fs/ocfs2/filecheck.h
> 
> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
> index ce210d4..e27e652 100644
> --- a/fs/ocfs2/Makefile
> +++ b/fs/ocfs2/Makefile
> @@ -41,7 +41,8 @@ ocfs2-objs := \
>  	quota_local.o		\
>  	quota_global.o		\
>  	xattr.o			\
> -	acl.o
> +	acl.o	\
> +	filecheck.o
>  
>  ocfs2_stackglue-objs := stackglue.o
>  ocfs2_stack_o2cb-objs := stack_o2cb.o
> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
> new file mode 100644
> index 0000000..f12ed1f
> --- /dev/null
> +++ b/fs/ocfs2/filecheck.c
> @@ -0,0 +1,566 @@
> +/* -*- mode: c; c-basic-offset: 8; -*-
> + * vim: noexpandtab sw=8 ts=8 sts=0:
> + *
> + * filecheck.c
> + *
> + * Code which implements online file check.
> + *
> + * Copyright (C) 2015 Novell.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation, version 2.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/kmod.h>
> +#include <linux/fs.h>
> +#include <linux/kobject.h>
> +#include <linux/sysfs.h>
> +#include <linux/sysctl.h>
> +#include <cluster/masklog.h>
> +
> +#include "ocfs2.h"
> +#include "ocfs2_fs.h"
> +#include "stackglue.h"
> +#include "inode.h"
> +
> +#include "filecheck.h"
> +
> +
> +/* File check error strings,
> + * must correspond with error number in header file.
> + */
> +static const char * const ocfs2_filecheck_errs[] = {
> +	"SUCCESS",
> +	"FAILED",
> +	"INPROGRESS",
> +	"READONLY",
> +	"INVALIDINO",
> +	"BLOCKECC",
> +	"BLOCKNO",
> +	"VALIDFLAG",
> +	"GENERATION",
> +	"UNSUPPORTED"
> +};
> +
> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
> +
> +struct ocfs2_filecheck {
> +	struct list_head fc_head;	/* File check entry list head */
> +	spinlock_t fc_lock;
> +	unsigned int fc_max;	/* Maximum number of entry in list */
> +	unsigned int fc_size;	/* Current entry count in list */
> +	unsigned int fc_done;	/* File check entries are done in list */
> +};
> +
> +struct ocfs2_filecheck_sysfs_entry {
> +	struct list_head fs_list;
> +	atomic_t fs_count;
> +	struct super_block *fs_sb;
> +	struct kset *fs_kset;
> +	struct ocfs2_filecheck *fs_fcheck;
> +};
> +
> +#define OCFS2_FILECHECK_MAXSIZE		100
> +#define OCFS2_FILECHECK_MINSIZE		10
> +
> +/* File check operation type */
> +enum {
> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
> +};
> +
> +struct ocfs2_filecheck_entry {
> +	struct list_head fe_list;
> +	unsigned long fe_ino;
> +	unsigned int fe_type;
> +	unsigned short fe_done:1;
> +	unsigned short fe_status:15;
> +};
> +
> +struct ocfs2_filecheck_args {
> +	unsigned int fa_type;
> +	union {
> +		unsigned long fa_ino;
> +		unsigned int fa_len;
> +	};
> +};
> +
> +static const char *
> +ocfs2_filecheck_error(int errno)
> +{
> +	if (!errno)
> +		return ocfs2_filecheck_errs[errno];
> +
> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
> +			errno > OCFS2_FILECHECK_ERR_END);
> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
> +}
> +
> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					char *buf);
> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					const char *buf, size_t count);
> +static struct kobj_attribute ocfs2_attr_filecheck =
> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
> +					ocfs2_filecheck_show,
> +					ocfs2_filecheck_store);
> +
> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
> +{
> +	schedule();
> +	return 0;
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	struct ocfs2_filecheck_entry *p;
> +
> +	if (!atomic_dec_and_test(&entry->fs_count))
> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
> +						TASK_UNINTERRUPTIBLE);
> +
> +	spin_lock(&entry->fs_fcheck->fc_lock);
> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
> +				struct ocfs2_filecheck_entry, fe_list);
> +		list_del(&p->fe_list);
> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
> +		kfree(p);
> +	}
> +	spin_unlock(&entry->fs_fcheck->fc_lock);
> +
> +	kset_unregister(entry->fs_kset);
> +	kfree(entry->fs_fcheck);
> +	kfree(entry);
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +}
> +
> +static int ocfs2_filecheck_sysfs_del(const char *devname)
> +{
> +	struct ocfs2_filecheck_sysfs_entry *p;
> +
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
> +		if (!strcmp(p->fs_sb->s_id, devname)) {
> +			list_del(&p->fs_list);
> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +			ocfs2_filecheck_sysfs_free(p);
> +			return 0;
> +		}
> +	}
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +	return 1;
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	if (atomic_dec_and_test(&entry->fs_count))
> +		wake_up_atomic_t(&entry->fs_count);
> +}
> +
> +static struct ocfs2_filecheck_sysfs_entry *
> +ocfs2_filecheck_sysfs_get(const char *devname)
> +{
> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
> +
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
> +		if (!strcmp(p->fs_sb->s_id, devname)) {
> +			atomic_inc(&p->fs_count);
> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +			return p;
> +		}
> +	}
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +	return NULL;
> +}
> +
> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
> +{
> +	int ret = 0;
> +	struct kset *ocfs2_filecheck_kset = NULL;
> +	struct ocfs2_filecheck *fcheck = NULL;
> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
> +	struct attribute **attrs = NULL;
> +	struct attribute_group attrgp;
> +
> +	if (!ocfs2_kset)
> +		return -ENOMEM;
> +
> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
> +	if (!attrs) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		attrs[0] = &ocfs2_attr_filecheck.attr;
> +		attrs[1] = NULL;
> +		memset(&attrgp, 0, sizeof(attrgp));
> +		attrgp.attrs = attrs;
> +	}
> +
> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
> +	if (!fcheck) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		INIT_LIST_HEAD(&fcheck->fc_head);
> +		spin_lock_init(&fcheck->fc_lock);
> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
> +		fcheck->fc_size = 0;
> +		fcheck->fc_done = 0;
> +	}
> +
> +	if (strlen(sb->s_id) <= 0) {
> +		mlog(ML_ERROR,
> +		"Cannot get device basename when create filecheck sysfs\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +
> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
> +						&ocfs2_kset->kobj);
> +	if (!ocfs2_filecheck_kset) {
> +		ret = -ENOMEM;
> +		goto error;
> +	}
> +
> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
> +	if (ret)
> +		goto error;
> +
> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
> +	if (!entry) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		atomic_set(&entry->fs_count, 1);
> +		entry->fs_sb = sb;
> +		entry->fs_kset = ocfs2_filecheck_kset;
> +		entry->fs_fcheck = fcheck;
> +		ocfs2_filecheck_sysfs_add(entry);
> +	}
> +
> +	kfree(attrs);
> +	return 0;
> +
> +error:
> +	kfree(attrs);
> +	kfree(entry);
> +	kfree(fcheck);
> +	kset_unregister(ocfs2_filecheck_kset);
> +	return ret;
> +}
> +
> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
> +{
> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int count);
> +static int
> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int len)
> +{
> +	int ret;
> +
> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
> +		return -EINVAL;
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
> +		mlog(ML_ERROR,
> +		"Cannot set online file check maximum entry number "
> +		"to %u due to too much pending entries(%u)\n",
> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
> +		ret = -EBUSY;
> +	} else {
> +		if (len < ent->fs_fcheck->fc_size)
> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
> +				ent->fs_fcheck->fc_size - len));
> +
> +		ent->fs_fcheck->fc_max = len;
> +		ret = 0;
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	return ret;
> +}
> +
> +#define OCFS2_FILECHECK_ARGS_LEN	32
> +static int
> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
> +				unsigned long *val)
> +{
> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
> +
> +	if (count < 1)
> +		return 1;
> +
> +	memcpy(buffer, buf, count);
> +	buffer[count] = '\0';
> +
> +	if (kstrtoul(buffer, 0, val))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static int
> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
> +				struct ocfs2_filecheck_args *args)
> +{
> +	unsigned long val = 0;
> +
> +	/* too short/long args length */
> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
> +		return 1;
> +
> +	if (!strncasecmp(buf, "FIX ", 4)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
> +		args->fa_ino = val;
> +		return 0;
> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
> +		args->fa_ino = val;
> +		return 0;
> +	} else if (!strncasecmp(buf, "SET ", 4)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
> +		args->fa_len = (unsigned int)val;
> +		return 0;
> +	} else { /* invalid args */
> +		return 1;
> +	}
> +}
> +
> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					char *buf)
> +{
> +
> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
> +	struct ocfs2_filecheck_entry *p;
> +	struct ocfs2_filecheck_sysfs_entry *ent;
> +
> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
> +	if (!ent) {
> +		mlog(ML_ERROR,
> +		"Cannot get the corresponding entry via device basename %s\n",
> +		kobj->name);
> +		return -ENODEV;
> +	}
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
> +	total += ret;
> +	remain -= ret;
> +
> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
> +			p->fe_ino, p->fe_type, p->fe_done,
> +			ocfs2_filecheck_error(p->fe_status));
> +		if (ret < 0) {
> +			total = ret;
> +			break;
> +		}
> +		if (ret == remain) {
> +			/* snprintf() didn't fit */
> +			total = -E2BIG;
> +			break;
> +		}
> +		total += ret;
> +		remain -= ret;
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	ocfs2_filecheck_sysfs_put(ent);
> +	return total;
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
> +{
> +	struct ocfs2_filecheck_entry *p;
> +
> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
> +		if (p->fe_done) {
> +			list_del(&p->fe_list);
> +			kfree(p);
> +			ent->fs_fcheck->fc_size--;
> +			ent->fs_fcheck->fc_done--;
> +			return 1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int count)
> +{
> +	unsigned int i = 0;
> +	unsigned int ret = 0;
> +
> +	while (i++ < count) {
> +		if (ocfs2_filecheck_erase_entry(ent))
> +			ret++;
> +		else
> +			break;
> +	}
> +
> +	return (ret == count ? 1 : 0);
> +}
> +
> +static void
> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
> +				struct ocfs2_filecheck_entry *entry)
> +{
> +	entry->fe_done = 1;
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	ent->fs_fcheck->fc_done++;
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +}
> +
> +static unsigned short
> +ocfs2_filecheck_handle(struct super_block *sb,
> +				unsigned long ino, unsigned int flags)
> +{
> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
> +	struct inode *inode = NULL;
> +	int rc;
> +
> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
> +	if (IS_ERR(inode)) {
> +		rc = (int)(-(long)inode);
> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
> +			rc < OCFS2_FILECHECK_ERR_END)
> +			ret = rc;
> +		else
> +			ret = OCFS2_FILECHECK_ERR_FAILED;
> +	} else
> +		iput(inode);
> +
> +	return ret;
> +}
> +
> +static void
> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
> +				struct ocfs2_filecheck_entry *entry)
> +{
> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
> +	else
> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
> +
> +	ocfs2_filecheck_done_entry(ent, entry);
> +}
> +
> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
> +				struct kobj_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct ocfs2_filecheck_args args;
> +	struct ocfs2_filecheck_entry *entry = NULL;
> +	struct ocfs2_filecheck_sysfs_entry *ent;
> +	ssize_t ret = 0;
> +
> +	if (count == 0)
> +		return count;
> +
> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
> +		return -EINVAL;
> +	}
> +
> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
> +	if (!ent) {
> +		mlog(ML_ERROR,
> +		"Cannot get the corresponding entry via device basename %s\n",
> +		kobj->name);
> +		return -ENODEV;
> +	}
> +
> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
> +		ocfs2_filecheck_sysfs_put(ent);
> +		return (!ret ? count : ret);
> +	}
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
> +		(ent->fs_fcheck->fc_done == 0)) {
> +		mlog(ML_ERROR,
> +		"Online file check queue(%u) is full\n",
> +		ent->fs_fcheck->fc_max);
> +		ret = -EBUSY;
> +	} else {
> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
> +			(ent->fs_fcheck->fc_done > 0)) {
> +			/* Delete the oldest entry which was done,
> +			 * make sure the entry size in list does
> +			 * not exceed maximum value
> +			 */
> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
> +		}
> +
> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
> +		if (entry) {
> +			entry->fe_ino = args.fa_ino;
> +			entry->fe_type = args.fa_type;
> +			entry->fe_done = 0;
> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
> +			list_add_tail(&entry->fe_list,
> +					&ent->fs_fcheck->fc_head);
> +
> +			ent->fs_fcheck->fc_size++;
> +			ret = count;
> +		} else {
> +			ret = -ENOMEM;
> +		}
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	if (entry)
> +		ocfs2_filecheck_handle_entry(ent, entry);
> +
> +	ocfs2_filecheck_sysfs_put(ent);
> +	return ret;
> +}
> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
> new file mode 100644
> index 0000000..5ec331b
> --- /dev/null
> +++ b/fs/ocfs2/filecheck.h
> @@ -0,0 +1,48 @@
> +/* -*- mode: c; c-basic-offset: 8; -*-
> + * vim: noexpandtab sw=8 ts=8 sts=0:
> + *
> + * filecheck.h
> + *
> + * Online file check.
> + *
> + * Copyright (C) 2015 Novell.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation, version 2.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +
> +#ifndef FILECHECK_H
> +#define FILECHECK_H
> +
> +#include <linux/types.h>
> +#include <linux/list.h>
> +
> +
> +/* File check errno */
> +enum {
> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
> +};
> +
> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
> +
> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
> +
> +#endif  /* FILECHECK_H */
> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
> index 5e86b24..abd1018 100644
> --- a/fs/ocfs2/inode.h
> +++ b/fs/ocfs2/inode.h
> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>  /* Flags for ocfs2_iget() */
>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
> +
>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>  			 int sysfile_type);
> 



* [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
@ 2015-11-03  7:20     ` Junxiao Bi
  0 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  7:20 UTC (permalink / raw)
  To: Gang He, mfasheh, rgoldwyn; +Cc: linux-kernel, ocfs2-devel, akpm

Hi Gang,

I don't see a need to add a sysfs file for the check and repair. This
leaves the customer with a hard decision: how do they decide whether to
repair the bad inode, given that the repair may make the corruption even
worse?
I think this feature should fix the error automatically if a repair
helps. Of course, this should be done only when error=continue is enabled,
or a new mount option could be added for it.

Thanks,
Junxiao.

On 10/28/2015 02:25 PM, Gang He wrote:
> Implement online file check sysfile interfaces, e.g.
> how to create the related sysfile according to device name,
> how to display/handle file check request from the sysfile.
> 
> Signed-off-by: Gang He <ghe@suse.com>
> ---
>  fs/ocfs2/Makefile    |   3 +-
>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/ocfs2/filecheck.h |  48 +++++
>  fs/ocfs2/inode.h     |   3 +
>  4 files changed, 619 insertions(+), 1 deletion(-)
>  create mode 100644 fs/ocfs2/filecheck.c
>  create mode 100644 fs/ocfs2/filecheck.h
> 
> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
> index ce210d4..e27e652 100644
> --- a/fs/ocfs2/Makefile
> +++ b/fs/ocfs2/Makefile
> @@ -41,7 +41,8 @@ ocfs2-objs := \
>  	quota_local.o		\
>  	quota_global.o		\
>  	xattr.o			\
> -	acl.o
> +	acl.o	\
> +	filecheck.o
>  
>  ocfs2_stackglue-objs := stackglue.o
>  ocfs2_stack_o2cb-objs := stack_o2cb.o
> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
> new file mode 100644
> index 0000000..f12ed1f
> --- /dev/null
> +++ b/fs/ocfs2/filecheck.c
> @@ -0,0 +1,566 @@
> +/* -*- mode: c; c-basic-offset: 8; -*-
> + * vim: noexpandtab sw=8 ts=8 sts=0:
> + *
> + * filecheck.c
> + *
> + * Code which implements online file check.
> + *
> + * Copyright (C) 2015 Novell.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation, version 2.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +#include <linux/list.h>
> +#include <linux/spinlock.h>
> +#include <linux/module.h>
> +#include <linux/slab.h>
> +#include <linux/kmod.h>
> +#include <linux/fs.h>
> +#include <linux/kobject.h>
> +#include <linux/sysfs.h>
> +#include <linux/sysctl.h>
> +#include <cluster/masklog.h>
> +
> +#include "ocfs2.h"
> +#include "ocfs2_fs.h"
> +#include "stackglue.h"
> +#include "inode.h"
> +
> +#include "filecheck.h"
> +
> +
> +/* File check error strings,
> + * must correspond with error number in header file.
> + */
> +static const char * const ocfs2_filecheck_errs[] = {
> +	"SUCCESS",
> +	"FAILED",
> +	"INPROGRESS",
> +	"READONLY",
> +	"INVALIDINO",
> +	"BLOCKECC",
> +	"BLOCKNO",
> +	"VALIDFLAG",
> +	"GENERATION",
> +	"UNSUPPORTED"
> +};
> +
> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
> +
> +struct ocfs2_filecheck {
> +	struct list_head fc_head;	/* File check entry list head */
> +	spinlock_t fc_lock;
> +	unsigned int fc_max;	/* Maximum number of entry in list */
> +	unsigned int fc_size;	/* Current entry count in list */
> +	unsigned int fc_done;	/* File check entries are done in list */
> +};
> +
> +struct ocfs2_filecheck_sysfs_entry {
> +	struct list_head fs_list;
> +	atomic_t fs_count;
> +	struct super_block *fs_sb;
> +	struct kset *fs_kset;
> +	struct ocfs2_filecheck *fs_fcheck;
> +};
> +
> +#define OCFS2_FILECHECK_MAXSIZE		100
> +#define OCFS2_FILECHECK_MINSIZE		10
> +
> +/* File check operation type */
> +enum {
> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
> +};
> +
> +struct ocfs2_filecheck_entry {
> +	struct list_head fe_list;
> +	unsigned long fe_ino;
> +	unsigned int fe_type;
> +	unsigned short fe_done:1;
> +	unsigned short fe_status:15;
> +};
> +
> +struct ocfs2_filecheck_args {
> +	unsigned int fa_type;
> +	union {
> +		unsigned long fa_ino;
> +		unsigned int fa_len;
> +	};
> +};
> +
> +static const char *
> +ocfs2_filecheck_error(int errno)
> +{
> +	if (!errno)
> +		return ocfs2_filecheck_errs[errno];
> +
> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
> +			errno > OCFS2_FILECHECK_ERR_END);
> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
> +}
> +
> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					char *buf);
> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					const char *buf, size_t count);
> +static struct kobj_attribute ocfs2_attr_filecheck =
> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
> +					ocfs2_filecheck_show,
> +					ocfs2_filecheck_store);
> +
> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
> +{
> +	schedule();
> +	return 0;
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	struct ocfs2_filecheck_entry *p;
> +
> +	if (!atomic_dec_and_test(&entry->fs_count))
> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
> +						TASK_UNINTERRUPTIBLE);
> +
> +	spin_lock(&entry->fs_fcheck->fc_lock);
> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
> +				struct ocfs2_filecheck_entry, fe_list);
> +		list_del(&p->fe_list);
> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
> +		kfree(p);
> +	}
> +	spin_unlock(&entry->fs_fcheck->fc_lock);
> +
> +	kset_unregister(entry->fs_kset);
> +	kfree(entry->fs_fcheck);
> +	kfree(entry);
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +}
> +
> +static int ocfs2_filecheck_sysfs_del(const char *devname)
> +{
> +	struct ocfs2_filecheck_sysfs_entry *p;
> +
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
> +		if (!strcmp(p->fs_sb->s_id, devname)) {
> +			list_del(&p->fs_list);
> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +			ocfs2_filecheck_sysfs_free(p);
> +			return 0;
> +		}
> +	}
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +	return 1;
> +}
> +
> +static void
> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
> +{
> +	if (atomic_dec_and_test(&entry->fs_count))
> +		wake_up_atomic_t(&entry->fs_count);
> +}
> +
> +static struct ocfs2_filecheck_sysfs_entry *
> +ocfs2_filecheck_sysfs_get(const char *devname)
> +{
> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
> +
> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
> +		if (!strcmp(p->fs_sb->s_id, devname)) {
> +			atomic_inc(&p->fs_count);
> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +			return p;
> +		}
> +	}
> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
> +	return NULL;
> +}
> +
> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
> +{
> +	int ret = 0;
> +	struct kset *ocfs2_filecheck_kset = NULL;
> +	struct ocfs2_filecheck *fcheck = NULL;
> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
> +	struct attribute **attrs = NULL;
> +	struct attribute_group attrgp;
> +
> +	if (!ocfs2_kset)
> +		return -ENOMEM;
> +
> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
> +	if (!attrs) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		attrs[0] = &ocfs2_attr_filecheck.attr;
> +		attrs[1] = NULL;
> +		memset(&attrgp, 0, sizeof(attrgp));
> +		attrgp.attrs = attrs;
> +	}
> +
> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
> +	if (!fcheck) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		INIT_LIST_HEAD(&fcheck->fc_head);
> +		spin_lock_init(&fcheck->fc_lock);
> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
> +		fcheck->fc_size = 0;
> +		fcheck->fc_done = 0;
> +	}
> +
> +	if (strlen(sb->s_id) <= 0) {
> +		mlog(ML_ERROR,
> +		"Cannot get device basename when create filecheck sysfs\n");
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +
> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
> +						&ocfs2_kset->kobj);
> +	if (!ocfs2_filecheck_kset) {
> +		ret = -ENOMEM;
> +		goto error;
> +	}
> +
> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
> +	if (ret)
> +		goto error;
> +
> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
> +	if (!entry) {
> +		ret = -ENOMEM;
> +		goto error;
> +	} else {
> +		atomic_set(&entry->fs_count, 1);
> +		entry->fs_sb = sb;
> +		entry->fs_kset = ocfs2_filecheck_kset;
> +		entry->fs_fcheck = fcheck;
> +		ocfs2_filecheck_sysfs_add(entry);
> +	}
> +
> +	kfree(attrs);
> +	return 0;
> +
> +error:
> +	kfree(attrs);
> +	kfree(entry);
> +	kfree(fcheck);
> +	kset_unregister(ocfs2_filecheck_kset);
> +	return ret;
> +}
> +
> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
> +{
> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int count);
> +static int
> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int len)
> +{
> +	int ret;
> +
> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
> +		return -EINVAL;
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
> +		mlog(ML_ERROR,
> +		"Cannot set online file check maximum entry number "
> +		"to %u due to too much pending entries(%u)\n",
> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
> +		ret = -EBUSY;
> +	} else {
> +		if (len < ent->fs_fcheck->fc_size)
> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
> +				ent->fs_fcheck->fc_size - len));
> +
> +		ent->fs_fcheck->fc_max = len;
> +		ret = 0;
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	return ret;
> +}
> +
> +#define OCFS2_FILECHECK_ARGS_LEN	32
> +static int
> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
> +				unsigned long *val)
> +{
> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
> +
> +	if (count < 1)
> +		return 1;
> +
> +	memcpy(buffer, buf, count);
> +	buffer[count] = '\0';
> +
> +	if (kstrtoul(buffer, 0, val))
> +		return 1;
> +
> +	return 0;
> +}
> +
> +static int
> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
> +				struct ocfs2_filecheck_args *args)
> +{
> +	unsigned long val = 0;
> +
> +	/* too short/long args length */
> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
> +		return 1;
> +
> +	if (!strncasecmp(buf, "FIX ", 4)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
> +		args->fa_ino = val;
> +		return 0;
> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
> +		args->fa_ino = val;
> +		return 0;
> +	} else if (!strncasecmp(buf, "SET ", 4)) {
> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
> +			return 1;
> +
> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
> +		args->fa_len = (unsigned int)val;
> +		return 0;
> +	} else { /* invalid args */
> +		return 1;
> +	}
> +}
> +
> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
> +					struct kobj_attribute *attr,
> +					char *buf)
> +{
> +
> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
> +	struct ocfs2_filecheck_entry *p;
> +	struct ocfs2_filecheck_sysfs_entry *ent;
> +
> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
> +	if (!ent) {
> +		mlog(ML_ERROR,
> +		"Cannot get the corresponding entry via device basename %s\n",
> +		kobj->name);
> +		return -ENODEV;
> +	}
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
> +	total += ret;
> +	remain -= ret;
> +
> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
> +			p->fe_ino, p->fe_type, p->fe_done,
> +			ocfs2_filecheck_error(p->fe_status));
> +		if (ret < 0) {
> +			total = ret;
> +			break;
> +		}
> +		if (ret == remain) {
> +			/* snprintf() didn't fit */
> +			total = -E2BIG;
> +			break;
> +		}
> +		total += ret;
> +		remain -= ret;
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	ocfs2_filecheck_sysfs_put(ent);
> +	return total;
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
> +{
> +	struct ocfs2_filecheck_entry *p;
> +
> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
> +		if (p->fe_done) {
> +			list_del(&p->fe_list);
> +			kfree(p);
> +			ent->fs_fcheck->fc_size--;
> +			ent->fs_fcheck->fc_done--;
> +			return 1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
> +				unsigned int count)
> +{
> +	unsigned int i = 0;
> +	unsigned int ret = 0;
> +
> +	while (i++ < count) {
> +		if (ocfs2_filecheck_erase_entry(ent))
> +			ret++;
> +		else
> +			break;
> +	}
> +
> +	return (ret == count ? 1 : 0);
> +}
> +
> +static void
> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
> +				struct ocfs2_filecheck_entry *entry)
> +{
> +	entry->fe_done = 1;
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	ent->fs_fcheck->fc_done++;
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +}
> +
> +static unsigned short
> +ocfs2_filecheck_handle(struct super_block *sb,
> +				unsigned long ino, unsigned int flags)
> +{
> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
> +	struct inode *inode = NULL;
> +	int rc;
> +
> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
> +	if (IS_ERR(inode)) {
> +		rc = (int)(-(long)inode);
> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
> +			rc < OCFS2_FILECHECK_ERR_END)
> +			ret = rc;
> +		else
> +			ret = OCFS2_FILECHECK_ERR_FAILED;
> +	} else
> +		iput(inode);
> +
> +	return ret;
> +}
> +
> +static void
> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
> +				struct ocfs2_filecheck_entry *entry)
> +{
> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
> +	else
> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
> +
> +	ocfs2_filecheck_done_entry(ent, entry);
> +}
> +
> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
> +				struct kobj_attribute *attr,
> +				const char *buf, size_t count)
> +{
> +	struct ocfs2_filecheck_args args;
> +	struct ocfs2_filecheck_entry *entry = NULL;
> +	struct ocfs2_filecheck_sysfs_entry *ent;
> +	ssize_t ret = 0;
> +
> +	if (count == 0)
> +		return count;
> +
> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
> +		return -EINVAL;
> +	}
> +
> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
> +	if (!ent) {
> +		mlog(ML_ERROR,
> +		"Cannot get the corresponding entry via device basename %s\n",
> +		kobj->name);
> +		return -ENODEV;
> +	}
> +
> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
> +		ocfs2_filecheck_sysfs_put(ent);
> +		return (!ret ? count : ret);
> +	}
> +
> +	spin_lock(&ent->fs_fcheck->fc_lock);
> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
> +		(ent->fs_fcheck->fc_done == 0)) {
> +		mlog(ML_ERROR,
> +		"Online file check queue(%u) is full\n",
> +		ent->fs_fcheck->fc_max);
> +		ret = -EBUSY;
> +	} else {
> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
> +			(ent->fs_fcheck->fc_done > 0)) {
> +			/* Delete the oldest entry which was done,
> +			 * make sure the entry size in list does
> +			 * not exceed maximum value
> +			 */
> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
> +		}
> +
> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
> +		if (entry) {
> +			entry->fe_ino = args.fa_ino;
> +			entry->fe_type = args.fa_type;
> +			entry->fe_done = 0;
> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
> +			list_add_tail(&entry->fe_list,
> +					&ent->fs_fcheck->fc_head);
> +
> +			ent->fs_fcheck->fc_size++;
> +			ret = count;
> +		} else {
> +			ret = -ENOMEM;
> +		}
> +	}
> +	spin_unlock(&ent->fs_fcheck->fc_lock);
> +
> +	if (entry)
> +		ocfs2_filecheck_handle_entry(ent, entry);
> +
> +	ocfs2_filecheck_sysfs_put(ent);
> +	return ret;
> +}
> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
> new file mode 100644
> index 0000000..5ec331b
> --- /dev/null
> +++ b/fs/ocfs2/filecheck.h
> @@ -0,0 +1,48 @@
> +/* -*- mode: c; c-basic-offset: 8; -*-
> + * vim: noexpandtab sw=8 ts=8 sts=0:
> + *
> + * filecheck.h
> + *
> + * Online file check.
> + *
> + * Copyright (C) 2015 Novell.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation, version 2.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + */
> +
> +
> +#ifndef FILECHECK_H
> +#define FILECHECK_H
> +
> +#include <linux/types.h>
> +#include <linux/list.h>
> +
> +
> +/* File check errno */
> +enum {
> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
> +};
> +
> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
> +
> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
> +
> +#endif  /* FILECHECK_H */
> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
> index 5e86b24..abd1018 100644
> --- a/fs/ocfs2/inode.h
> +++ b/fs/ocfs2/inode.h
> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>  /* Flags for ocfs2_iget() */
>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
> +
>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>  			 int sysfile_type);
> 


* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-03  7:20     ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-03  7:54       ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  7:54 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Junxiao,

Thanks for your review.
In the current design, we use a sysfile as an interface to check/fix a file (by passing an inode number).
This operation is triggered manually by the user, instead of being fixed automatically in the kernel.
Why?
1) We should let users make this decision, since some users do not want to fix a file when they encounter file system corruption; they may want to keep the file system unchanged for further investigation.
2) Frankly speaking, this feature could cause a second corruption if there is an error in the code, so I do not suggest enabling automatic fixing by default in the first version.
3) In the future, once this feature is well proven, we can add a mount option to enable automatic fixing.
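
To illustrate the manual workflow, here is a minimal user-space sketch; it is not part of this patch set. It assumes the sysfile lives at /sys/fs/ocfs2/<devname>/filecheck as described in the cover letter. The helper names build_filecheck_cmd and send_filecheck, and the FILECHECK_CMD_MAX constant (which mirrors OCFS2_FILECHECK_ARGS_LEN), are made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed to mirror OCFS2_FILECHECK_ARGS_LEN in the patch (32 bytes). */
#define FILECHECK_CMD_MAX 32

/*
 * Hypothetical helper: format "<OP> <inode>" (e.g. "CHECK 1234") in the
 * form the store handler parses. Returns the command length, or -1 if
 * the command does not fit in the buffer or exceeds the length limit.
 */
int build_filecheck_cmd(char *out, size_t len, const char *op,
			unsigned long ino)
{
	int n;

	assert(out != NULL && op != NULL);
	n = snprintf(out, len, "%s %lu", op, ino);
	if (n < 0 || (size_t)n >= len || n > FILECHECK_CMD_MAX)
		return -1;
	return n;
}

/*
 * Hypothetical helper: write one request to the filecheck sysfile of
 * the given device. Returns 0 on success, -1 on any failure.
 */
int send_filecheck(const char *devname, const char *op, unsigned long ino)
{
	char path[256];
	char cmd[FILECHECK_CMD_MAX + 1];
	FILE *f;
	int n, rc = 0;

	if (build_filecheck_cmd(cmd, sizeof(cmd), op, ino) < 0)
		return -1;

	n = snprintf(path, sizeof(path), "/sys/fs/ocfs2/%s/filecheck",
		     devname);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;
	if (fputs(cmd, f) == EOF)
		rc = -1;
	/* The write is flushed on close, so a failed close is a failure. */
	if (fclose(f) == EOF)
		rc = -1;
	return rc;
}
```

For example, send_filecheck("sda1", "CHECK", 1234) would write "CHECK 1234" to the sysfile; reading the same sysfile afterwards lists the request with its DONE flag and ERROR string. The device name sda1 is only a placeholder.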


Thanks
Gang
   


>>> 
> Hi Gang,
> 
> I don't see a need to add a sysfs file for the check and repair. This
> leaves the customer with a hard decision: how do they decide whether to
> repair the bad inode, given that the repair may make the corruption even
> worse?
> I think this feature should fix the error automatically if a repair
> helps. Of course, this should be done only when error=continue is enabled,
> or a new mount option could be added for it.
> 
> Thanks,
> Junxiao.
> 
> On 10/28/2015 02:25 PM, Gang He wrote:
>> Implement online file check sysfile interfaces, e.g.
>> how to create the related sysfile according to device name,
>> how to display/handle file check request from the sysfile.
>> 
>> Signed-off-by: Gang He <ghe@suse.com>
>> ---
>>  fs/ocfs2/Makefile    |   3 +-
>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  fs/ocfs2/filecheck.h |  48 +++++
>>  fs/ocfs2/inode.h     |   3 +
>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>  create mode 100644 fs/ocfs2/filecheck.c
>>  create mode 100644 fs/ocfs2/filecheck.h
>> 
>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>> index ce210d4..e27e652 100644
>> --- a/fs/ocfs2/Makefile
>> +++ b/fs/ocfs2/Makefile
>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>  	quota_local.o		\
>>  	quota_global.o		\
>>  	xattr.o			\
>> -	acl.o
>> +	acl.o	\
>> +	filecheck.o
>>  
>>  ocfs2_stackglue-objs := stackglue.o
>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>> new file mode 100644
>> index 0000000..f12ed1f
>> --- /dev/null
>> +++ b/fs/ocfs2/filecheck.c
>> @@ -0,0 +1,566 @@
>> +/* -*- mode: c; c-basic-offset: 8; -*-
>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>> + *
>> + * filecheck.c
>> + *
>> + * Code which implements online file check.
>> + *
>> + * Copyright (C) 2015 Novell.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License as published by the Free Software Foundation, version 2.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +#include <linux/list.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/module.h>
>> +#include <linux/slab.h>
>> +#include <linux/kmod.h>
>> +#include <linux/fs.h>
>> +#include <linux/kobject.h>
>> +#include <linux/sysfs.h>
>> +#include <linux/sysctl.h>
>> +#include <cluster/masklog.h>
>> +
>> +#include "ocfs2.h"
>> +#include "ocfs2_fs.h"
>> +#include "stackglue.h"
>> +#include "inode.h"
>> +
>> +#include "filecheck.h"
>> +
>> +
>> +/* File check error strings,
>> + * must correspond with error number in header file.
>> + */
>> +static const char * const ocfs2_filecheck_errs[] = {
>> +	"SUCCESS",
>> +	"FAILED",
>> +	"INPROGRESS",
>> +	"READONLY",
>> +	"INVALIDINO",
>> +	"BLOCKECC",
>> +	"BLOCKNO",
>> +	"VALIDFLAG",
>> +	"GENERATION",
>> +	"UNSUPPORTED"
>> +};
>> +
>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>> +
>> +struct ocfs2_filecheck {
>> +	struct list_head fc_head;	/* File check entry list head */
>> +	spinlock_t fc_lock;
>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>> +	unsigned int fc_size;	/* Current entry count in list */
>> +	unsigned int fc_done;	/* File check entries are done in list */
>> +};
>> +
>> +struct ocfs2_filecheck_sysfs_entry {
>> +	struct list_head fs_list;
>> +	atomic_t fs_count;
>> +	struct super_block *fs_sb;
>> +	struct kset *fs_kset;
>> +	struct ocfs2_filecheck *fs_fcheck;
>> +};
>> +
>> +#define OCFS2_FILECHECK_MAXSIZE		100
>> +#define OCFS2_FILECHECK_MINSIZE		10
>> +
>> +/* File check operation type */
>> +enum {
>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>> +};
>> +
>> +struct ocfs2_filecheck_entry {
>> +	struct list_head fe_list;
>> +	unsigned long fe_ino;
>> +	unsigned int fe_type;
>> +	unsigned short fe_done:1;
>> +	unsigned short fe_status:15;
>> +};
>> +
>> +struct ocfs2_filecheck_args {
>> +	unsigned int fa_type;
>> +	union {
>> +		unsigned long fa_ino;
>> +		unsigned int fa_len;
>> +	};
>> +};
>> +
>> +static const char *
>> +ocfs2_filecheck_error(int errno)
>> +{
>> +	if (!errno)
>> +		return ocfs2_filecheck_errs[errno];
>> +
>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>> +			errno > OCFS2_FILECHECK_ERR_END);
>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					char *buf);
>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					const char *buf, size_t count);
>> +static struct kobj_attribute ocfs2_attr_filecheck =
>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>> +					ocfs2_filecheck_show,
>> +					ocfs2_filecheck_store);
>> +
>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>> +{
>> +	schedule();
>> +	return 0;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	struct ocfs2_filecheck_entry *p;
>> +
>> +	if (!atomic_dec_and_test(&entry->fs_count))
>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>> +						TASK_UNINTERRUPTIBLE);
>> +
>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>> +				struct ocfs2_filecheck_entry, fe_list);
>> +		list_del(&p->fe_list);
>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>> +		kfree(p);
>> +	}
>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>> +
>> +	kset_unregister(entry->fs_kset);
>> +	kfree(entry->fs_fcheck);
>> +	kfree(entry);
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +}
>> +
>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>> +{
>> +	struct ocfs2_filecheck_sysfs_entry *p;
>> +
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>> +			list_del(&p->fs_list);
>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +			ocfs2_filecheck_sysfs_free(p);
>> +			return 0;
>> +		}
>> +	}
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +	return 1;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	if (atomic_dec_and_test(&entry->fs_count))
>> +		wake_up_atomic_t(&entry->fs_count);
>> +}
>> +
>> +static struct ocfs2_filecheck_sysfs_entry *
>> +ocfs2_filecheck_sysfs_get(const char *devname)
>> +{
>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>> +
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>> +			atomic_inc(&p->fs_count);
>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +			return p;
>> +		}
>> +	}
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +	return NULL;
>> +}
>> +
>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>> +{
>> +	int ret = 0;
>> +	struct kset *ocfs2_filecheck_kset = NULL;
>> +	struct ocfs2_filecheck *fcheck = NULL;
>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>> +	struct attribute **attrs = NULL;
>> +	struct attribute_group attrgp;
>> +
>> +	if (!ocfs2_kset)
>> +		return -ENOMEM;
>> +
>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>> +	if (!attrs) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>> +		attrs[1] = NULL;
>> +		memset(&attrgp, 0, sizeof(attrgp));
>> +		attrgp.attrs = attrs;
>> +	}
>> +
>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>> +	if (!fcheck) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>> +		spin_lock_init(&fcheck->fc_lock);
>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>> +		fcheck->fc_size = 0;
>> +		fcheck->fc_done = 0;
>> +	}
>> +
>> +	if (strlen(sb->s_id) <= 0) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get device basename when create filecheck sysfs\n");
>> +		ret = -ENODEV;
>> +		goto error;
>> +	}
>> +
>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>> +						&ocfs2_kset->kobj);
>> +	if (!ocfs2_filecheck_kset) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	}
>> +
>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>> +	if (ret)
>> +		goto error;
>> +
>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>> +	if (!entry) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		atomic_set(&entry->fs_count, 1);
>> +		entry->fs_sb = sb;
>> +		entry->fs_kset = ocfs2_filecheck_kset;
>> +		entry->fs_fcheck = fcheck;
>> +		ocfs2_filecheck_sysfs_add(entry);
>> +	}
>> +
>> +	kfree(attrs);
>> +	return 0;
>> +
>> +error:
>> +	kfree(attrs);
>> +	kfree(entry);
>> +	kfree(fcheck);
>> +	kset_unregister(ocfs2_filecheck_kset);
>> +	return ret;
>> +}
>> +
>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>> +{
>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int count);
>> +static int
>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int len)
>> +{
>> +	int ret;
>> +
>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>> +		return -EINVAL;
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>> +		mlog(ML_ERROR,
>> +		"Cannot set online file check maximum entry number "
>> +		"to %u due to too much pending entries(%u)\n",
>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>> +		ret = -EBUSY;
>> +	} else {
>> +		if (len < ent->fs_fcheck->fc_size)
>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>> +				ent->fs_fcheck->fc_size - len));
>> +
>> +		ent->fs_fcheck->fc_max = len;
>> +		ret = 0;
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>> +static int
>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>> +				unsigned long *val)
>> +{
>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>> +
>> +	if (count < 1)
>> +		return 1;
>> +
>> +	memcpy(buffer, buf, count);
>> +	buffer[count] = '\0';
>> +
>> +	if (kstrtoul(buffer, 0, val))
>> +		return 1;
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>> +				struct ocfs2_filecheck_args *args)
>> +{
>> +	unsigned long val = 0;
>> +
>> +	/* too short/long args length */
>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>> +		return 1;
>> +
>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>> +		args->fa_ino = val;
>> +		return 0;
>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>> +		args->fa_ino = val;
>> +		return 0;
>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>> +		args->fa_len = (unsigned int)val;
>> +		return 0;
>> +	} else { /* invalid args */
>> +		return 1;
>> +	}
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					char *buf)
>> +{
>> +
>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>> +	struct ocfs2_filecheck_entry *p;
>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>> +
>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>> +	if (!ent) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get the corresponding entry via device basename %s\n",
>> +		kobj->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>> +	total += ret;
>> +	remain -= ret;
>> +
>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>> +			p->fe_ino, p->fe_type, p->fe_done,
>> +			ocfs2_filecheck_error(p->fe_status));
>> +		if (ret < 0) {
>> +			total = ret;
>> +			break;
>> +		}
>> +		if (ret == remain) {
>> +			/* snprintf() didn't fit */
>> +			total = -E2BIG;
>> +			break;
>> +		}
>> +		total += ret;
>> +		remain -= ret;
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	ocfs2_filecheck_sysfs_put(ent);
>> +	return total;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>> +{
>> +	struct ocfs2_filecheck_entry *p;
>> +
>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>> +		if (p->fe_done) {
>> +			list_del(&p->fe_list);
>> +			kfree(p);
>> +			ent->fs_fcheck->fc_size--;
>> +			ent->fs_fcheck->fc_done--;
>> +			return 1;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int count)
>> +{
>> +	unsigned int i = 0;
>> +	unsigned int ret = 0;
>> +
>> +	while (i++ < count) {
>> +		if (ocfs2_filecheck_erase_entry(ent))
>> +			ret++;
>> +		else
>> +			break;
>> +	}
>> +
>> +	return (ret == count ? 1 : 0);
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				struct ocfs2_filecheck_entry *entry)
>> +{
>> +	entry->fe_done = 1;
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	ent->fs_fcheck->fc_done++;
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +}
>> +
>> +static unsigned short
>> +ocfs2_filecheck_handle(struct super_block *sb,
>> +				unsigned long ino, unsigned int flags)
>> +{
>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>> +	struct inode *inode = NULL;
>> +	int rc;
>> +
>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>> +	if (IS_ERR(inode)) {
>> +		rc = (int)(-(long)inode);
>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>> +			rc < OCFS2_FILECHECK_ERR_END)
>> +			ret = rc;
>> +		else
>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>> +	} else
>> +		iput(inode);
>> +
>> +	return ret;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				struct ocfs2_filecheck_entry *entry)
>> +{
>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>> +	else
>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>> +
>> +	ocfs2_filecheck_done_entry(ent, entry);
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>> +				struct kobj_attribute *attr,
>> +				const char *buf, size_t count)
>> +{
>> +	struct ocfs2_filecheck_args args;
>> +	struct ocfs2_filecheck_entry *entry = NULL;
>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>> +	ssize_t ret = 0;
>> +
>> +	if (count == 0)
>> +		return count;
>> +
>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>> +	if (!ent) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get the corresponding entry via device basename %s\n",
>> +		kobj->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>> +		ocfs2_filecheck_sysfs_put(ent);
>> +		return (!ret ? count : ret);
>> +	}
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>> +		(ent->fs_fcheck->fc_done == 0)) {
>> +		mlog(ML_ERROR,
>> +		"Online file check queue(%u) is full\n",
>> +		ent->fs_fcheck->fc_max);
>> +		ret = -EBUSY;
>> +	} else {
>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>> +			(ent->fs_fcheck->fc_done > 0)) {
>> +			/* Delete the oldest entry which was done,
>> +			 * make sure the entry size in list does
>> +			 * not exceed maximum value
>> +			 */
>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>> +		}
>> +
>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>> +		if (entry) {
>> +			entry->fe_ino = args.fa_ino;
>> +			entry->fe_type = args.fa_type;
>> +			entry->fe_done = 0;
>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>> +			list_add_tail(&entry->fe_list,
>> +					&ent->fs_fcheck->fc_head);
>> +
>> +			ent->fs_fcheck->fc_size++;
>> +			ret = count;
>> +		} else {
>> +			ret = -ENOMEM;
>> +		}
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	if (entry)
>> +		ocfs2_filecheck_handle_entry(ent, entry);
>> +
>> +	ocfs2_filecheck_sysfs_put(ent);
>> +	return ret;
>> +}
>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>> new file mode 100644
>> index 0000000..5ec331b
>> --- /dev/null
>> +++ b/fs/ocfs2/filecheck.h
>> @@ -0,0 +1,48 @@
>> +/* -*- mode: c; c-basic-offset: 8; -*-
>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>> + *
>> + * filecheck.h
>> + *
>> + * Online file check.
>> + *
>> + * Copyright (C) 2015 Novell.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License as published by the Free Software Foundation, version 2.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +
>> +#ifndef FILECHECK_H
>> +#define FILECHECK_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/list.h>
>> +
>> +
>> +/* File check errno */
>> +enum {
>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>> +};
>> +
>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>> +
>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>> +
>> +#endif  /* FILECHECK_H */
>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>> index 5e86b24..abd1018 100644
>> --- a/fs/ocfs2/inode.h
>> +++ b/fs/ocfs2/inode.h
>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>  /* Flags for ocfs2_iget() */
>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>> +
>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>>  			 int sysfile_type);
>> 


* [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
@ 2015-11-03  7:54       ` Gang He
From: Gang He @ 2015-11-03  7:54 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Junxiao,

Thanks for your review.
In the current design, we use a sysfs file as the interface to check or fix a file (by passing its inode number).
This operation is triggered manually by the user rather than being performed automatically in the kernel.
Why?
1) We should let users make this decision. Some users do not want to fix anything when they hit a file system corruption; they may want to keep the file system unchanged for further investigation.
2) Frankly speaking, this feature could cause a second corruption if there is a bug in the code, so I do not suggest enabling automatic fixing by default in the first version.
3) In the future, once this feature is well proven, we can add a mount option to enable automatic fixing.
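
To make the manual interface concrete, here is a minimal userspace sketch of how a request written to the filecheck file ("CHECK <inode>", "FIX <inode>" or "SET <len>") could be parsed. It is only an illustration and not the code from the patch: the names fc_parse, fc_request and FC_ARGS_LEN are hypothetical, and strtoul() stands in for the kernel's kstrtoul().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strncasecmp */

/* Hypothetical userspace mirror of the request types in the patch. */
enum fc_type { FC_CHECK, FC_FIX, FC_SET };

struct fc_request {
	enum fc_type type;
	unsigned long value;	/* inode number, or queue length for SET */
};

#define FC_ARGS_LEN 32		/* same limit as OCFS2_FILECHECK_ARGS_LEN */

/* Parse "CHECK <n>", "FIX <n>" or "SET <n>"; return 0 on success, -1 on error. */
static int fc_parse(const char *buf, size_t count, struct fc_request *req)
{
	static const struct { const char *kw; enum fc_type type; } cmds[] = {
		{ "CHECK ", FC_CHECK }, { "FIX ", FC_FIX }, { "SET ", FC_SET },
	};
	char num[FC_ARGS_LEN + 1];
	char *end;
	size_t i, klen, nlen;

	if (count > FC_ARGS_LEN)
		return -1;

	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
		klen = strlen(cmds[i].kw);
		/* Need the keyword plus at least one more character. */
		if (count <= klen || strncasecmp(buf, cmds[i].kw, klen) != 0)
			continue;

		/* Copy the numeric part so that it is NUL-terminated. */
		nlen = count - klen;
		memcpy(num, buf + klen, nlen);
		num[nlen] = '\0';

		/* echo appends a newline; strip one trailing newline. */
		if (nlen > 0 && num[nlen - 1] == '\n')
			num[--nlen] = '\0';

		/* Reject empty input, signs and leading blanks up front. */
		if (num[0] < '0' || num[0] > '9')
			return -1;

		errno = 0;
		req->value = strtoul(num, &end, 0);
		if (errno != 0 || *end != '\0')
			return -1;

		req->type = cmds[i].type;
		return 0;
	}

	return -1;	/* unknown command */
}
```

With this sketch, a user writing "CHECK 1234" gets a check-only request, while "FIX 1234" is needed before anything on disk may be changed, so the decision to repair stays with the user.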


Thanks
Gang
   


>>> 
> Hi Gang,
> 
> I don't see a need to add a sysfs file for the check and repair. This
> leaves a hard decision to the customer. How do they decide whether
> they should repair the bad inode, since this may make the corruption
> even worse?
> I think the error should be fixed by this feature automatically if repair
> helps; of course this can be done only when error=continue is enabled, or
> some mount option could be added for it.
> 
> Thanks,
> Junxiao.
> 
> On 10/28/2015 02:25 PM, Gang He wrote:
>> Implement online file check sysfile interfaces, e.g.
>> how to create the related sysfile according to device name,
>> how to display/handle file check request from the sysfile.
>> 
>> Signed-off-by: Gang He <ghe@suse.com>
>> ---
>>  fs/ocfs2/Makefile    |   3 +-
>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>  fs/ocfs2/filecheck.h |  48 +++++
>>  fs/ocfs2/inode.h     |   3 +
>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>  create mode 100644 fs/ocfs2/filecheck.c
>>  create mode 100644 fs/ocfs2/filecheck.h
>> 
>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>> index ce210d4..e27e652 100644
>> --- a/fs/ocfs2/Makefile
>> +++ b/fs/ocfs2/Makefile
>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>  	quota_local.o		\
>>  	quota_global.o		\
>>  	xattr.o			\
>> -	acl.o
>> +	acl.o	\
>> +	filecheck.o
>>  
>>  ocfs2_stackglue-objs := stackglue.o
>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>> new file mode 100644
>> index 0000000..f12ed1f
>> --- /dev/null
>> +++ b/fs/ocfs2/filecheck.c
>> @@ -0,0 +1,566 @@
>> +/* -*- mode: c; c-basic-offset: 8; -*-
>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>> + *
>> + * filecheck.c
>> + *
>> + * Code which implements online file check.
>> + *
>> + * Copyright (C) 2015 Novell.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License as published by the Free Software Foundation, version 2.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +#include <linux/list.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/module.h>
>> +#include <linux/slab.h>
>> +#include <linux/kmod.h>
>> +#include <linux/fs.h>
>> +#include <linux/kobject.h>
>> +#include <linux/sysfs.h>
>> +#include <linux/sysctl.h>
>> +#include <cluster/masklog.h>
>> +
>> +#include "ocfs2.h"
>> +#include "ocfs2_fs.h"
>> +#include "stackglue.h"
>> +#include "inode.h"
>> +
>> +#include "filecheck.h"
>> +
>> +
>> +/* File check error strings,
>> + * must correspond with error number in header file.
>> + */
>> +static const char * const ocfs2_filecheck_errs[] = {
>> +	"SUCCESS",
>> +	"FAILED",
>> +	"INPROGRESS",
>> +	"READONLY",
>> +	"INVALIDINO",
>> +	"BLOCKECC",
>> +	"BLOCKNO",
>> +	"VALIDFLAG",
>> +	"GENERATION",
>> +	"UNSUPPORTED"
>> +};
>> +
>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>> +
>> +struct ocfs2_filecheck {
>> +	struct list_head fc_head;	/* File check entry list head */
>> +	spinlock_t fc_lock;
>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>> +	unsigned int fc_size;	/* Current entry count in list */
>> +	unsigned int fc_done;	/* File check entries are done in list */
>> +};
>> +
>> +struct ocfs2_filecheck_sysfs_entry {
>> +	struct list_head fs_list;
>> +	atomic_t fs_count;
>> +	struct super_block *fs_sb;
>> +	struct kset *fs_kset;
>> +	struct ocfs2_filecheck *fs_fcheck;
>> +};
>> +
>> +#define OCFS2_FILECHECK_MAXSIZE		100
>> +#define OCFS2_FILECHECK_MINSIZE		10
>> +
>> +/* File check operation type */
>> +enum {
>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>> +};
>> +
>> +struct ocfs2_filecheck_entry {
>> +	struct list_head fe_list;
>> +	unsigned long fe_ino;
>> +	unsigned int fe_type;
>> +	unsigned short fe_done:1;
>> +	unsigned short fe_status:15;
>> +};
>> +
>> +struct ocfs2_filecheck_args {
>> +	unsigned int fa_type;
>> +	union {
>> +		unsigned long fa_ino;
>> +		unsigned int fa_len;
>> +	};
>> +};
>> +
>> +static const char *
>> +ocfs2_filecheck_error(int errno)
>> +{
>> +	if (!errno)
>> +		return ocfs2_filecheck_errs[errno];
>> +
>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>> +			errno > OCFS2_FILECHECK_ERR_END);
>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					char *buf);
>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					const char *buf, size_t count);
>> +static struct kobj_attribute ocfs2_attr_filecheck =
>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>> +					ocfs2_filecheck_show,
>> +					ocfs2_filecheck_store);
>> +
>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>> +{
>> +	schedule();
>> +	return 0;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	struct ocfs2_filecheck_entry *p;
>> +
>> +	if (!atomic_dec_and_test(&entry->fs_count))
>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>> +						TASK_UNINTERRUPTIBLE);
>> +
>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>> +				struct ocfs2_filecheck_entry, fe_list);
>> +		list_del(&p->fe_list);
>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>> +		kfree(p);
>> +	}
>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>> +
>> +	kset_unregister(entry->fs_kset);
>> +	kfree(entry->fs_fcheck);
>> +	kfree(entry);
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +}
>> +
>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>> +{
>> +	struct ocfs2_filecheck_sysfs_entry *p;
>> +
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>> +			list_del(&p->fs_list);
>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +			ocfs2_filecheck_sysfs_free(p);
>> +			return 0;
>> +		}
>> +	}
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +	return 1;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>> +{
>> +	if (atomic_dec_and_test(&entry->fs_count))
>> +		wake_up_atomic_t(&entry->fs_count);
>> +}
>> +
>> +static struct ocfs2_filecheck_sysfs_entry *
>> +ocfs2_filecheck_sysfs_get(const char *devname)
>> +{
>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>> +
>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>> +			atomic_inc(&p->fs_count);
>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +			return p;
>> +		}
>> +	}
>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>> +	return NULL;
>> +}
>> +
>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>> +{
>> +	int ret = 0;
>> +	struct kset *ocfs2_filecheck_kset = NULL;
>> +	struct ocfs2_filecheck *fcheck = NULL;
>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>> +	struct attribute **attrs = NULL;
>> +	struct attribute_group attrgp;
>> +
>> +	if (!ocfs2_kset)
>> +		return -ENOMEM;
>> +
>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>> +	if (!attrs) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>> +		attrs[1] = NULL;
>> +		memset(&attrgp, 0, sizeof(attrgp));
>> +		attrgp.attrs = attrs;
>> +	}
>> +
>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>> +	if (!fcheck) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>> +		spin_lock_init(&fcheck->fc_lock);
>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>> +		fcheck->fc_size = 0;
>> +		fcheck->fc_done = 0;
>> +	}
>> +
>> +	if (strlen(sb->s_id) <= 0) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get device basename when create filecheck sysfs\n");
>> +		ret = -ENODEV;
>> +		goto error;
>> +	}
>> +
>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>> +						&ocfs2_kset->kobj);
>> +	if (!ocfs2_filecheck_kset) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	}
>> +
>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>> +	if (ret)
>> +		goto error;
>> +
>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>> +	if (!entry) {
>> +		ret = -ENOMEM;
>> +		goto error;
>> +	} else {
>> +		atomic_set(&entry->fs_count, 1);
>> +		entry->fs_sb = sb;
>> +		entry->fs_kset = ocfs2_filecheck_kset;
>> +		entry->fs_fcheck = fcheck;
>> +		ocfs2_filecheck_sysfs_add(entry);
>> +	}
>> +
>> +	kfree(attrs);
>> +	return 0;
>> +
>> +error:
>> +	kfree(attrs);
>> +	kfree(entry);
>> +	kfree(fcheck);
>> +	kset_unregister(ocfs2_filecheck_kset);
>> +	return ret;
>> +}
>> +
>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>> +{
>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int count);
>> +static int
>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int len)
>> +{
>> +	int ret;
>> +
>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>> +		return -EINVAL;
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>> +		mlog(ML_ERROR,
>> +		"Cannot set online file check maximum entry number "
>> +		"to %u due to too much pending entries(%u)\n",
>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>> +		ret = -EBUSY;
>> +	} else {
>> +		if (len < ent->fs_fcheck->fc_size)
>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>> +				ent->fs_fcheck->fc_size - len));
>> +
>> +		ent->fs_fcheck->fc_max = len;
>> +		ret = 0;
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	return ret;
>> +}
>> +
>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>> +static int
>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>> +				unsigned long *val)
>> +{
>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>> +
>> +	if (count < 1)
>> +		return 1;
>> +
>> +	memcpy(buffer, buf, count);
>> +	buffer[count] = '\0';
>> +
>> +	if (kstrtoul(buffer, 0, val))
>> +		return 1;
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>> +				struct ocfs2_filecheck_args *args)
>> +{
>> +	unsigned long val = 0;
>> +
>> +	/* too short/long args length */
>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>> +		return 1;
>> +
>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>> +		args->fa_ino = val;
>> +		return 0;
>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>> +		args->fa_ino = val;
>> +		return 0;
>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>> +			return 1;
>> +
>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>> +		args->fa_len = (unsigned int)val;
>> +		return 0;
>> +	} else { /* invalid args */
>> +		return 1;
>> +	}
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>> +					struct kobj_attribute *attr,
>> +					char *buf)
>> +{
>> +
>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>> +	struct ocfs2_filecheck_entry *p;
>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>> +
>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>> +	if (!ent) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get the corresponding entry via device basename %s\n",
>> +		kobj->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>> +	total += ret;
>> +	remain -= ret;
>> +
>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>> +			p->fe_ino, p->fe_type, p->fe_done,
>> +			ocfs2_filecheck_error(p->fe_status));
>> +		if (ret < 0) {
>> +			total = ret;
>> +			break;
>> +		}
>> +		if (ret == remain) {
>> +			/* snprintf() didn't fit */
>> +			total = -E2BIG;
>> +			break;
>> +		}
>> +		total += ret;
>> +		remain -= ret;
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	ocfs2_filecheck_sysfs_put(ent);
>> +	return total;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>> +{
>> +	struct ocfs2_filecheck_entry *p;
>> +
>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>> +		if (p->fe_done) {
>> +			list_del(&p->fe_list);
>> +			kfree(p);
>> +			ent->fs_fcheck->fc_size--;
>> +			ent->fs_fcheck->fc_done--;
>> +			return 1;
>> +		}
>> +	}
>> +
>> +	return 0;
>> +}
>> +
>> +static int
>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				unsigned int count)
>> +{
>> +	unsigned int i = 0;
>> +	unsigned int ret = 0;
>> +
>> +	while (i++ < count) {
>> +		if (ocfs2_filecheck_erase_entry(ent))
>> +			ret++;
>> +		else
>> +			break;
>> +	}
>> +
>> +	return (ret == count ? 1 : 0);
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				struct ocfs2_filecheck_entry *entry)
>> +{
>> +	entry->fe_done = 1;
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	ent->fs_fcheck->fc_done++;
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +}
>> +
>> +static unsigned short
>> +ocfs2_filecheck_handle(struct super_block *sb,
>> +				unsigned long ino, unsigned int flags)
>> +{
>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>> +	struct inode *inode = NULL;
>> +	int rc;
>> +
>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>> +	if (IS_ERR(inode)) {
>> +		rc = (int)(-(long)inode);
>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>> +			rc < OCFS2_FILECHECK_ERR_END)
>> +			ret = rc;
>> +		else
>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>> +	} else
>> +		iput(inode);
>> +
>> +	return ret;
>> +}
>> +
>> +static void
>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>> +				struct ocfs2_filecheck_entry *entry)
>> +{
>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>> +	else
>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>> +
>> +	ocfs2_filecheck_done_entry(ent, entry);
>> +}
>> +
>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>> +				struct kobj_attribute *attr,
>> +				const char *buf, size_t count)
>> +{
>> +	struct ocfs2_filecheck_args args;
>> +	struct ocfs2_filecheck_entry *entry = NULL;
>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>> +	ssize_t ret = 0;
>> +
>> +	if (count == 0)
>> +		return count;
>> +
>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>> +		return -EINVAL;
>> +	}
>> +
>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>> +	if (!ent) {
>> +		mlog(ML_ERROR,
>> +		"Cannot get the corresponding entry via device basename %s\n",
>> +		kobj->name);
>> +		return -ENODEV;
>> +	}
>> +
>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>> +		ocfs2_filecheck_sysfs_put(ent);
>> +		return (!ret ? count : ret);
>> +	}
>> +
>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>> +		(ent->fs_fcheck->fc_done == 0)) {
>> +		mlog(ML_ERROR,
>> +		"Online file check queue(%u) is full\n",
>> +		ent->fs_fcheck->fc_max);
>> +		ret = -EBUSY;
>> +	} else {
>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>> +			(ent->fs_fcheck->fc_done > 0)) {
>> +			/* Delete the oldest entry which was done,
>> +			 * make sure the entry size in list does
>> +			 * not exceed maximum value
>> +			 */
>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>> +		}
>> +
>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>> +		if (entry) {
>> +			entry->fe_ino = args.fa_ino;
>> +			entry->fe_type = args.fa_type;
>> +			entry->fe_done = 0;
>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>> +			list_add_tail(&entry->fe_list,
>> +					&ent->fs_fcheck->fc_head);
>> +
>> +			ent->fs_fcheck->fc_size++;
>> +			ret = count;
>> +		} else {
>> +			ret = -ENOMEM;
>> +		}
>> +	}
>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>> +
>> +	if (entry)
>> +		ocfs2_filecheck_handle_entry(ent, entry);
>> +
>> +	ocfs2_filecheck_sysfs_put(ent);
>> +	return ret;
>> +}
>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>> new file mode 100644
>> index 0000000..5ec331b
>> --- /dev/null
>> +++ b/fs/ocfs2/filecheck.h
>> @@ -0,0 +1,48 @@
>> +/* -*- mode: c; c-basic-offset: 8; -*-
>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>> + *
>> + * filecheck.h
>> + *
>> + * Online file check.
>> + *
>> + * Copyright (C) 2015 Novell.  All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public
>> + * License as published by the Free Software Foundation, version 2.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>> + * General Public License for more details.
>> + */
>> +
>> +
>> +#ifndef FILECHECK_H
>> +#define FILECHECK_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/list.h>
>> +
>> +
>> +/* File check errno */
>> +enum {
>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>> +};
>> +
>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>> +
>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>> +
>> +#endif  /* FILECHECK_H */
>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>> index 5e86b24..abd1018 100644
>> --- a/fs/ocfs2/inode.h
>> +++ b/fs/ocfs2/inode.h
>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>  /* Flags for ocfs2_iget() */
>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>> +
>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>>  			 int sysfile_type);
>> 

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  7:12     ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-03  8:15       ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  8:15 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hello Junxiao,

See my comments inline.


>>> 
> Hi Gang,
> 
> This does not look like the right patch.
> First, online file check only checks the inode's block number, valid flag,
> fs generation value, and meta ecc. I have never seen a real corruption
> that affected only these fields; if these fields are corrupted, it means
> something bad may have happened elsewhere. So fixing these fields may not help
> and may even make the corruption worse.
This online file check/fix feature is meant to check/fix light file meta block corruption without taking the file system offline and running fsck.ocfs2.
For example, for a meta ecc error we really do not need to run fsck.ocfs2.
Of course, this feature does not replace fsck.ocfs2, and it does not touch complicated meta block problems. If there are potential problems in specific areas, we can discuss them one by one.



> Second, the repair method is wrong. In
> ocfs2_filecheck_repair_inode_block(), if these fields on disk don't
> match the ones in memory, the ones in memory are used to update the disk
> fields. The question is how do you know these fields in memory are
> right (they may be the ones that are really corrupted)?
Here, if the inode block is corrupted, the file system is not able to load it into memory.
ocfs2_filecheck_repair_inode_block() makes it loadable, since it tries to fix these light-level problems before loading.
If the fix succeeds, the changed meta block can pass the block-validate function and be loaded into memory as an inode object.
Since the file system runs in a cluster environment, we have to use existing functions and code paths to keep these block operations under a cluster lock.
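
The validate-then-repair order described above can be illustrated with a small user-space sketch. Everything in it is an assumption made for illustration: struct sketch_dinode, SKETCH_VALID_FL and the sketch_* functions are hypothetical stand-ins, not the real struct ocfs2_dinode or the functions in the patch, and meta ecc recomputation and cluster locking are left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the inode fields discussed in
 * this thread; this is NOT the real struct ocfs2_dinode. */
struct sketch_dinode {
	uint64_t i_blkno;		/* block number recorded in the inode */
	uint32_t i_flags;		/* bit 0 plays the role of OCFS2_VALID_FL */
	uint32_t i_fs_generation;	/* generation recorded in the inode */
};

#define SKETCH_VALID_FL	0x1u

enum sketch_err {
	SKETCH_OK = 0,
	SKETCH_ERR_BLOCKNO,
	SKETCH_ERR_VALIDFLAG,
	SKETCH_ERR_GENERATION
};

/* Compare the inode fields with the values the caller already has:
 * the block number used for the read and the superblock generation. */
static enum sketch_err sketch_validate(const struct sketch_dinode *di,
				       uint64_t blocknr, uint32_t sb_gen)
{
	assert(di);
	if (di->i_blkno != blocknr)
		return SKETCH_ERR_BLOCKNO;
	if (!(di->i_flags & SKETCH_VALID_FL))
		return SKETCH_ERR_VALIDFLAG;
	if (di->i_fs_generation != sb_gen)
		return SKETCH_ERR_GENERATION;
	return SKETCH_OK;
}

/* Rewrite each mismatching field; returns how many fields changed.
 * An already valid block is left untouched. */
static int sketch_repair(struct sketch_dinode *di,
			 uint64_t blocknr, uint32_t sb_gen)
{
	int changed = 0;

	if (sketch_validate(di, blocknr, sb_gen) == SKETCH_OK)
		return 0;

	if (di->i_blkno != blocknr) {
		di->i_blkno = blocknr;
		changed++;
	}
	if (!(di->i_flags & SKETCH_VALID_FL)) {
		di->i_flags |= SKETCH_VALID_FL;
		changed++;
	}
	if (di->i_fs_generation != sb_gen) {
		di->i_fs_generation = sb_gen;
		changed++;
	}
	return changed;
}
```

After sketch_repair() the same block passes sketch_validate(), which mirrors the claim that a repaired block can then go through the normal validation path. Note that the reference values come from outside the block (the read address and the superblock), not from a cached copy of the inode.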


Thanks
Gang

> 
> Thanks,
> Junxiao.
> On 10/28/2015 02:26 PM, Gang He wrote:
>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>> +			       struct buffer_head *bh)
>> +{
>> +	int rc;
>> +	int changed = 0;
>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>> +
>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>> +	/* Can't fix invalid inode block */
>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>> +		return rc;
>> +
>> +	trace_ocfs2_filecheck_repair_inode_block(
>> +		(unsigned long long)bh->b_blocknr);
>> +
>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>> +		mlog(ML_ERROR,
>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>> +			(unsigned long long)bh->b_blocknr);
>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>> +	}
>> +
>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>> +		changed = 1;
>> +		mlog(ML_ERROR,
>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>> +			(unsigned long long)bh->b_blocknr,
>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>> +	}
>> +
>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>> +		changed = 1;
>> +		mlog(ML_ERROR,
>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>> +			(unsigned long long)bh->b_blocknr);
>> +	}
>> +
>> +	if (le32_to_cpu(di->i_fs_generation) !=
>> +	    OCFS2_SB(sb)->fs_generation) {
>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>> +		changed = 1;
>> +		mlog(ML_ERROR,
>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>> +			(unsigned long long)bh->b_blocknr,
>> +			le32_to_cpu(di->i_fs_generation));
>> +	}
>> +
>> +	if (changed ||
>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>> +		mark_buffer_dirty(bh);
>> +		mlog(ML_ERROR,
>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>> +			(unsigned long long)bh->b_blocknr);
>> +	}
>> +
>> +	return 0;
>> +}


^ permalink raw reply	[flat|nested] 80+ messages in thread


* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-03  7:54       ` [Ocfs2-devel] " Gang He
@ 2015-11-03  8:20         ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  8:20 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Gang,

On 11/03/2015 03:54 PM, Gang He wrote:
> Hi Junxiao,
> 
> Thanks for your review.
> In the current design, we use a sysfile as an interface to check/fix a file (by passing an inode number).
> But this operation is triggered manually by the user, instead of being fixed automatically in the kernel.
> Why?
> 1) We should let users make this decision, since some users do not want to fix anything when encountering a file system corruption; they may want to keep the file system unchanged for further investigation.
If users don't want this, they should not use the errors=continue option;
letting the fs carry on after a corruption is very dangerous.
> 2) Frankly speaking, this feature would probably introduce a second corruption if there is some error in the code, so I do not suggest enabling automatic fixing by default in the first version.
I think if this feature could cause more corruption, then that should be
fixed first.

Thanks,
Junxiao
> 3) In the future, once this feature is well proven, we can add a mount option to enable automatic fixing.
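
For readers following the interface discussion, here is a rough user-space sketch of the command format the sysfile accepts ("CHECK <ino>", "FIX <ino>", "SET <len>", keywords case-insensitive). It is only an approximation of the quoted ocfs2_filecheck_args_parse(): the sketch_* names are hypothetical, strtoul() stands in for the kernel's kstrtoul(), and a trailing newline from echo is stripped explicitly.

```c
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

enum sketch_cmd { SKETCH_CHECK, SKETCH_FIX, SKETCH_SET, SKETCH_INVALID };

#define SKETCH_ARGS_LEN	32	/* same limit as OCFS2_FILECHECK_ARGS_LEN */

/* Parse one command; on success the number is stored in *val. */
static enum sketch_cmd sketch_parse(const char *buf, size_t count,
				    unsigned long *val)
{
	static const struct {
		const char *kw;
		enum sketch_cmd cmd;
	} table[] = {
		{ "CHECK ", SKETCH_CHECK },
		{ "FIX ", SKETCH_FIX },
		{ "SET ", SKETCH_SET },
	};
	char num[SKETCH_ARGS_LEN + 1];
	char *end;
	size_t i, klen, nlen;

	/* echo appends a newline; drop it before parsing */
	while (count > 0 && buf[count - 1] == '\n')
		count--;
	if (count > SKETCH_ARGS_LEN)
		return SKETCH_INVALID;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		klen = strlen(table[i].kw);
		/* need the keyword plus at least one more character */
		if (count <= klen || strncasecmp(buf, table[i].kw, klen) != 0)
			continue;

		nlen = count - klen;
		memcpy(num, buf + klen, nlen);
		num[nlen] = '\0';

		/* accept only an unsigned number with nothing after it */
		if (num[0] < '0' || num[0] > '9')
			return SKETCH_INVALID;
		*val = strtoul(num, &end, 0);
		if (*end != '\0')
			return SKETCH_INVALID;
		return table[i].cmd;
	}
	return SKETCH_INVALID;
}
```

With this sketch, "CHECK 1234\n" yields SKETCH_CHECK with the value 1234, while "FIX 12abc" is rejected. Whether to fix is then entirely the caller's choice, which is the point under discussion here.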
> 
> 
> Thanks
> Gang
>    
> 
> 
>>>>
>> Hi Gang,
>>
>> I don't see a need to add a sysfs file for the check and repair. This
>> leaves a hard decision to the customer. How do they decide whether
>> they should repair the bad inode, since this may make the corruption even
>> worse?
>> I think the error should be fixed by this feature automatically if repair
>> helps; of course this can be done only when errors=continue is enabled, or
>> some mount option could be added for it.
>>
>> Thanks,
>> Junxiao.
>>
>> On 10/28/2015 02:25 PM, Gang He wrote:
>>> Implement online file check sysfile interfaces, e.g.
>>> how to create the related sysfile according to device name,
>>> how to display/handle file check request from the sysfile.
>>>
>>> Signed-off-by: Gang He <ghe@suse.com>
>>> ---
>>>  fs/ocfs2/Makefile    |   3 +-
>>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  fs/ocfs2/filecheck.h |  48 +++++
>>>  fs/ocfs2/inode.h     |   3 +
>>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>>  create mode 100644 fs/ocfs2/filecheck.c
>>>  create mode 100644 fs/ocfs2/filecheck.h
>>>
>>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>>> index ce210d4..e27e652 100644
>>> --- a/fs/ocfs2/Makefile
>>> +++ b/fs/ocfs2/Makefile
>>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>>  	quota_local.o		\
>>>  	quota_global.o		\
>>>  	xattr.o			\
>>> -	acl.o
>>> +	acl.o	\
>>> +	filecheck.o
>>>  
>>>  ocfs2_stackglue-objs := stackglue.o
>>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>>> new file mode 100644
>>> index 0000000..f12ed1f
>>> --- /dev/null
>>> +++ b/fs/ocfs2/filecheck.c
>>> @@ -0,0 +1,566 @@
>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>> + *
>>> + * filecheck.c
>>> + *
>>> + * Code which implements online file check.
>>> + *
>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU General Public
>>> + * License as published by the Free Software Foundation, version 2.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * General Public License for more details.
>>> + */
>>> +
>>> +#include <linux/list.h>
>>> +#include <linux/spinlock.h>
>>> +#include <linux/module.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/kmod.h>
>>> +#include <linux/fs.h>
>>> +#include <linux/kobject.h>
>>> +#include <linux/sysfs.h>
>>> +#include <linux/sysctl.h>
>>> +#include <cluster/masklog.h>
>>> +
>>> +#include "ocfs2.h"
>>> +#include "ocfs2_fs.h"
>>> +#include "stackglue.h"
>>> +#include "inode.h"
>>> +
>>> +#include "filecheck.h"
>>> +
>>> +
>>> +/* File check error strings,
>>> + * must correspond with error number in header file.
>>> + */
>>> +static const char * const ocfs2_filecheck_errs[] = {
>>> +	"SUCCESS",
>>> +	"FAILED",
>>> +	"INPROGRESS",
>>> +	"READONLY",
>>> +	"INVALIDINO",
>>> +	"BLOCKECC",
>>> +	"BLOCKNO",
>>> +	"VALIDFLAG",
>>> +	"GENERATION",
>>> +	"UNSUPPORTED"
>>> +};
>>> +
>>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>>> +
>>> +struct ocfs2_filecheck {
>>> +	struct list_head fc_head;	/* File check entry list head */
>>> +	spinlock_t fc_lock;
>>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>>> +	unsigned int fc_size;	/* Current entry count in list */
>>> +	unsigned int fc_done;	/* File check entries are done in list */
>>> +};
>>> +
>>> +struct ocfs2_filecheck_sysfs_entry {
>>> +	struct list_head fs_list;
>>> +	atomic_t fs_count;
>>> +	struct super_block *fs_sb;
>>> +	struct kset *fs_kset;
>>> +	struct ocfs2_filecheck *fs_fcheck;
>>> +};
>>> +
>>> +#define OCFS2_FILECHECK_MAXSIZE		100
>>> +#define OCFS2_FILECHECK_MINSIZE		10
>>> +
>>> +/* File check operation type */
>>> +enum {
>>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>>> +};
>>> +
>>> +struct ocfs2_filecheck_entry {
>>> +	struct list_head fe_list;
>>> +	unsigned long fe_ino;
>>> +	unsigned int fe_type;
>>> +	unsigned short fe_done:1;
>>> +	unsigned short fe_status:15;
>>> +};
>>> +
>>> +struct ocfs2_filecheck_args {
>>> +	unsigned int fa_type;
>>> +	union {
>>> +		unsigned long fa_ino;
>>> +		unsigned int fa_len;
>>> +	};
>>> +};
>>> +
>>> +static const char *
>>> +ocfs2_filecheck_error(int errno)
>>> +{
>>> +	if (!errno)
>>> +		return ocfs2_filecheck_errs[errno];
>>> +
>>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>>> +			errno > OCFS2_FILECHECK_ERR_END);
>>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					char *buf);
>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					const char *buf, size_t count);
>>> +static struct kobj_attribute ocfs2_attr_filecheck =
>>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>>> +					ocfs2_filecheck_show,
>>> +					ocfs2_filecheck_store);
>>> +
>>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>>> +{
>>> +	schedule();
>>> +	return 0;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	struct ocfs2_filecheck_entry *p;
>>> +
>>> +	if (!atomic_dec_and_test(&entry->fs_count))
>>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>>> +						TASK_UNINTERRUPTIBLE);
>>> +
>>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>>> +				struct ocfs2_filecheck_entry, fe_list);
>>> +		list_del(&p->fe_list);
>>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>>> +		kfree(p);
>>> +	}
>>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>>> +
>>> +	kset_unregister(entry->fs_kset);
>>> +	kfree(entry->fs_fcheck);
>>> +	kfree(entry);
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +}
>>> +
>>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>>> +{
>>> +	struct ocfs2_filecheck_sysfs_entry *p;
>>> +
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>> +			list_del(&p->fs_list);
>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +			ocfs2_filecheck_sysfs_free(p);
>>> +			return 0;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +	return 1;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	if (atomic_dec_and_test(&entry->fs_count))
>>> +		wake_up_atomic_t(&entry->fs_count);
>>> +}
>>> +
>>> +static struct ocfs2_filecheck_sysfs_entry *
>>> +ocfs2_filecheck_sysfs_get(const char *devname)
>>> +{
>>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>>> +
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>> +			atomic_inc(&p->fs_count);
>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +			return p;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +	return NULL;
>>> +}
>>> +
>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>>> +{
>>> +	int ret = 0;
>>> +	struct kset *ocfs2_filecheck_kset = NULL;
>>> +	struct ocfs2_filecheck *fcheck = NULL;
>>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>>> +	struct attribute **attrs = NULL;
>>> +	struct attribute_group attrgp;
>>> +
>>> +	if (!ocfs2_kset)
>>> +		return -ENOMEM;
>>> +
>>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>>> +	if (!attrs) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>>> +		attrs[1] = NULL;
>>> +		memset(&attrgp, 0, sizeof(attrgp));
>>> +		attrgp.attrs = attrs;
>>> +	}
>>> +
>>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>>> +	if (!fcheck) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>>> +		spin_lock_init(&fcheck->fc_lock);
>>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>>> +		fcheck->fc_size = 0;
>>> +		fcheck->fc_done = 0;
>>> +	}
>>> +
>>> +	if (strlen(sb->s_id) <= 0) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get device basename when create filecheck sysfs\n");
>>> +		ret = -ENODEV;
>>> +		goto error;
>>> +	}
>>> +
>>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>>> +						&ocfs2_kset->kobj);
>>> +	if (!ocfs2_filecheck_kset) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	}
>>> +
>>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>>> +	if (ret)
>>> +		goto error;
>>> +
>>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>>> +	if (!entry) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		atomic_set(&entry->fs_count, 1);
>>> +		entry->fs_sb = sb;
>>> +		entry->fs_kset = ocfs2_filecheck_kset;
>>> +		entry->fs_fcheck = fcheck;
>>> +		ocfs2_filecheck_sysfs_add(entry);
>>> +	}
>>> +
>>> +	kfree(attrs);
>>> +	return 0;
>>> +
>>> +error:
>>> +	kfree(attrs);
>>> +	kfree(entry);
>>> +	kfree(fcheck);
>>> +	kset_unregister(ocfs2_filecheck_kset);
>>> +	return ret;
>>> +}
>>> +
>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>>> +{
>>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int count);
>>> +static int
>>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int len)
>>> +{
>>> +	int ret;
>>> +
>>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>>> +		return -EINVAL;
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot set online file check maximum entry number "
>>> +		"to %u due to too much pending entries(%u)\n",
>>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>>> +		ret = -EBUSY;
>>> +	} else {
>>> +		if (len < ent->fs_fcheck->fc_size)
>>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>>> +				ent->fs_fcheck->fc_size - len));
>>> +
>>> +		ent->fs_fcheck->fc_max = len;
>>> +		ret = 0;
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>>> +static int
>>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>>> +				unsigned long *val)
>>> +{
>>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>>> +
>>> +	if (count < 1)
>>> +		return 1;
>>> +
>>> +	memcpy(buffer, buf, count);
>>> +	buffer[count] = '\0';
>>> +
>>> +	if (kstrtoul(buffer, 0, val))
>>> +		return 1;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>>> +				struct ocfs2_filecheck_args *args)
>>> +{
>>> +	unsigned long val = 0;
>>> +
>>> +	/* too short/long args length */
>>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>>> +		return 1;
>>> +
>>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>>> +		args->fa_ino = val;
>>> +		return 0;
>>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>>> +		args->fa_ino = val;
>>> +		return 0;
>>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>>> +		args->fa_len = (unsigned int)val;
>>> +		return 0;
>>> +	} else { /* invalid args */
>>> +		return 1;
>>> +	}
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					char *buf)
>>> +{
>>> +
>>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>>> +	struct ocfs2_filecheck_entry *p;
>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>> +
>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>> +	if (!ent) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>> +		kobj->name);
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>>> +	total += ret;
>>> +	remain -= ret;
>>> +
>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>>> +			p->fe_ino, p->fe_type, p->fe_done,
>>> +			ocfs2_filecheck_error(p->fe_status));
>>> +		if (ret < 0) {
>>> +			total = ret;
>>> +			break;
>>> +		}
>>> +		if (ret == remain) {
>>> +			/* snprintf() didn't fit */
>>> +			total = -E2BIG;
>>> +			break;
>>> +		}
>>> +		total += ret;
>>> +		remain -= ret;
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	ocfs2_filecheck_sysfs_put(ent);
>>> +	return total;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>>> +{
>>> +	struct ocfs2_filecheck_entry *p;
>>> +
>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>> +		if (p->fe_done) {
>>> +			list_del(&p->fe_list);
>>> +			kfree(p);
>>> +			ent->fs_fcheck->fc_size--;
>>> +			ent->fs_fcheck->fc_done--;
>>> +			return 1;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int count)
>>> +{
>>> +	unsigned int i = 0;
>>> +	unsigned int ret = 0;
>>> +
>>> +	while (i++ < count) {
>>> +		if (ocfs2_filecheck_erase_entry(ent))
>>> +			ret++;
>>> +		else
>>> +			break;
>>> +	}
>>> +
>>> +	return (ret == count ? 1 : 0);
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				struct ocfs2_filecheck_entry *entry)
>>> +{
>>> +	entry->fe_done = 1;
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	ent->fs_fcheck->fc_done++;
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +}
>>> +
>>> +static unsigned short
>>> +ocfs2_filecheck_handle(struct super_block *sb,
>>> +				unsigned long ino, unsigned int flags)
>>> +{
>>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>>> +	struct inode *inode = NULL;
>>> +	int rc;
>>> +
>>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>>> +	if (IS_ERR(inode)) {
>>> +		rc = (int)(-(long)inode);
>>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>>> +			rc < OCFS2_FILECHECK_ERR_END)
>>> +			ret = rc;
>>> +		else
>>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>>> +	} else
>>> +		iput(inode);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				struct ocfs2_filecheck_entry *entry)
>>> +{
>>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>>> +	else
>>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>>> +
>>> +	ocfs2_filecheck_done_entry(ent, entry);
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>> +				struct kobj_attribute *attr,
>>> +				const char *buf, size_t count)
>>> +{
>>> +	struct ocfs2_filecheck_args args;
>>> +	struct ocfs2_filecheck_entry *entry = NULL;
>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>> +	ssize_t ret = 0;
>>> +
>>> +	if (count == 0)
>>> +		return count;
>>> +
>>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>> +	if (!ent) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>> +		kobj->name);
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>>> +		ocfs2_filecheck_sysfs_put(ent);
>>> +		return (!ret ? count : ret);
>>> +	}
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>> +		(ent->fs_fcheck->fc_done == 0)) {
>>> +		mlog(ML_ERROR,
>>> +		"Online file check queue(%u) is full\n",
>>> +		ent->fs_fcheck->fc_max);
>>> +		ret = -EBUSY;
>>> +	} else {
>>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>> +			(ent->fs_fcheck->fc_done > 0)) {
>>> +			/* Delete the oldest entry which was done,
>>> +			 * make sure the entry size in list does
>>> +			 * not exceed maximum value
>>> +			 */
>>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>>> +		}
>>> +
>>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>>> +		if (entry) {
>>> +			entry->fe_ino = args.fa_ino;
>>> +			entry->fe_type = args.fa_type;
>>> +			entry->fe_done = 0;
>>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>>> +			list_add_tail(&entry->fe_list,
>>> +					&ent->fs_fcheck->fc_head);
>>> +
>>> +			ent->fs_fcheck->fc_size++;
>>> +			ret = count;
>>> +		} else {
>>> +			ret = -ENOMEM;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	if (entry)
>>> +		ocfs2_filecheck_handle_entry(ent, entry);
>>> +
>>> +	ocfs2_filecheck_sysfs_put(ent);
>>> +	return ret;
>>> +}
>>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>>> new file mode 100644
>>> index 0000000..5ec331b
>>> --- /dev/null
>>> +++ b/fs/ocfs2/filecheck.h
>>> @@ -0,0 +1,48 @@
>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>> + *
>>> + * filecheck.h
>>> + *
>>> + * Online file check.
>>> + *
>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU General Public
>>> + * License as published by the Free Software Foundation, version 2.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * General Public License for more details.
>>> + */
>>> +
>>> +
>>> +#ifndef FILECHECK_H
>>> +#define FILECHECK_H
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/list.h>
>>> +
>>> +
>>> +/* File check errno */
>>> +enum {
>>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>>> +};
>>> +
>>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>>> +
>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>>> +
>>> +#endif  /* FILECHECK_H */
>>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>>> index 5e86b24..abd1018 100644
>>> --- a/fs/ocfs2/inode.h
>>> +++ b/fs/ocfs2/inode.h
>>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>>  /* Flags for ocfs2_iget() */
>>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>>> +
>>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>>>  			 int sysfile_type);
>>>
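
For reference, the error enum quoted above keeps 0 for success and starts its failure codes at 1000, while the string table in filecheck.c is dense, so any lookup has to shift the index by the start of the failure range. A minimal userspace sketch of that mapping follows. The fc_ names are hypothetical stand-ins for the OCFS2_FILECHECK_ERR_* values, and returning NULL for an out-of-range code is an assumption made for illustration (the patch uses BUG_ON there).

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the layout of the quoted enum: 0, then a block from 1000. */
enum {
	FC_ERR_SUCCESS = 0,
	FC_ERR_FAILED = 1000,
	FC_ERR_INPROGRESS,
	FC_ERR_READONLY,
	FC_ERR_INVALIDINO,
	FC_ERR_BLOCKECC,
	FC_ERR_BLOCKNO,
	FC_ERR_VALIDFLAG,
	FC_ERR_GENERATION,
	FC_ERR_UNSUPPORTED
};

#define FC_ERR_START	FC_ERR_FAILED
#define FC_ERR_END	FC_ERR_UNSUPPORTED

/* Dense table: slot 0 is success, slots 1..9 are the failure codes. */
static const char *const fc_errs[] = {
	"SUCCESS", "FAILED", "INPROGRESS", "READONLY", "INVALIDINO",
	"BLOCKECC", "BLOCKNO", "VALIDFLAG", "GENERATION", "UNSUPPORTED"
};

/* Catch a table that drifts out of sync with the enum at compile time. */
static_assert(sizeof(fc_errs) / sizeof(fc_errs[0]) ==
	      FC_ERR_END - FC_ERR_START + 2,
	      "string table must match the error enum");

/* Hypothetical helper: returns NULL instead of crashing on a bad code. */
static const char *fc_error_string(int code)
{
	if (code == FC_ERR_SUCCESS)
		return fc_errs[0];
	if (code < FC_ERR_START || code > FC_ERR_END)
		return NULL;
	/* +1 skips the "SUCCESS" slot at index 0. */
	return fc_errs[code - FC_ERR_START + 1];
}
```

The compile-time size check is one way to enforce the "must correspond with error number in header file" rule that the patch states only in a comment.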


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
@ 2015-11-03  8:20         ` Junxiao Bi
  0 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  8:20 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Gang,

On 11/03/2015 03:54 PM, Gang He wrote:
> Hi Junxiao,
> 
> Thanks for your review.
> In the current design, we use a sysfs file as an interface to check/fix a file (by passing an inode number).
> But this operation is triggered manually by the user, instead of being fixed automatically in the kernel.
> Why?
> 1) We should let users make this decision, since some users do not want to fix anything when they encounter a file system corruption; they may want to keep the file system unchanged for further investigation.
If users don't want this, they should not use the errors=continue option;
letting the fs carry on after a corruption is very dangerous.
> 2) Frankly speaking, this feature could introduce a second corruption if there is an error in the code, so I do not suggest fixing automatically by default in the first version.
I think if this feature could bring more corruption, then this should be
fixed first.

Thanks,
Junxiao
> 3) In the future, once this feature is well proven, we can add a mount option to enable automatic fixing.
> 
> 
> Thanks
> Gang
>    
> 
> 
>>>>
>> Hi Gang,
>>
>> I don't see a need to add a sysfs file for the check and repair. This
>> leaves a hard decision to the customer. How do they decide whether
>> they should repair the bad inode, since this may make the corruption
>> even worse?
>> I think the error should be fixed by this feature automatically if repair
>> helps; of course, this should be done only when errors=continue is enabled,
>> or some mount option could be added for it.
>>
>> Thanks,
>> Junxiao.
>>
>> On 10/28/2015 02:25 PM, Gang He wrote:
>>> Implement online file check sysfile interfaces, e.g.
>>> how to create the related sysfile according to device name,
>>> how to display/handle file check request from the sysfile.
>>>
>>> Signed-off-by: Gang He <ghe@suse.com>
>>> ---
>>>  fs/ocfs2/Makefile    |   3 +-
>>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>  fs/ocfs2/filecheck.h |  48 +++++
>>>  fs/ocfs2/inode.h     |   3 +
>>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>>  create mode 100644 fs/ocfs2/filecheck.c
>>>  create mode 100644 fs/ocfs2/filecheck.h
>>>
>>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>>> index ce210d4..e27e652 100644
>>> --- a/fs/ocfs2/Makefile
>>> +++ b/fs/ocfs2/Makefile
>>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>>  	quota_local.o		\
>>>  	quota_global.o		\
>>>  	xattr.o			\
>>> -	acl.o
>>> +	acl.o	\
>>> +	filecheck.o
>>>  
>>>  ocfs2_stackglue-objs := stackglue.o
>>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>>> new file mode 100644
>>> index 0000000..f12ed1f
>>> --- /dev/null
>>> +++ b/fs/ocfs2/filecheck.c
>>> @@ -0,0 +1,566 @@
>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>> + *
>>> + * filecheck.c
>>> + *
>>> + * Code which implements online file check.
>>> + *
>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU General Public
>>> + * License as published by the Free Software Foundation, version 2.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * General Public License for more details.
>>> + */
>>> +
>>> +#include <linux/list.h>
>>> +#include <linux/spinlock.h>
>>> +#include <linux/module.h>
>>> +#include <linux/slab.h>
>>> +#include <linux/kmod.h>
>>> +#include <linux/fs.h>
>>> +#include <linux/kobject.h>
>>> +#include <linux/sysfs.h>
>>> +#include <linux/sysctl.h>
>>> +#include <cluster/masklog.h>
>>> +
>>> +#include "ocfs2.h"
>>> +#include "ocfs2_fs.h"
>>> +#include "stackglue.h"
>>> +#include "inode.h"
>>> +
>>> +#include "filecheck.h"
>>> +
>>> +
>>> +/* File check error strings,
>>> + * must correspond with error number in header file.
>>> + */
>>> +static const char * const ocfs2_filecheck_errs[] = {
>>> +	"SUCCESS",
>>> +	"FAILED",
>>> +	"INPROGRESS",
>>> +	"READONLY",
>>> +	"INVALIDINO",
>>> +	"BLOCKECC",
>>> +	"BLOCKNO",
>>> +	"VALIDFLAG",
>>> +	"GENERATION",
>>> +	"UNSUPPORTED"
>>> +};
>>> +
>>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>>> +
>>> +struct ocfs2_filecheck {
>>> +	struct list_head fc_head;	/* File check entry list head */
>>> +	spinlock_t fc_lock;
>>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>>> +	unsigned int fc_size;	/* Current entry count in list */
>>> +	unsigned int fc_done;	/* File check entries are done in list */
>>> +};
>>> +
>>> +struct ocfs2_filecheck_sysfs_entry {
>>> +	struct list_head fs_list;
>>> +	atomic_t fs_count;
>>> +	struct super_block *fs_sb;
>>> +	struct kset *fs_kset;
>>> +	struct ocfs2_filecheck *fs_fcheck;
>>> +};
>>> +
>>> +#define OCFS2_FILECHECK_MAXSIZE		100
>>> +#define OCFS2_FILECHECK_MINSIZE		10
>>> +
>>> +/* File check operation type */
>>> +enum {
>>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>>> +};
>>> +
>>> +struct ocfs2_filecheck_entry {
>>> +	struct list_head fe_list;
>>> +	unsigned long fe_ino;
>>> +	unsigned int fe_type;
>>> +	unsigned short fe_done:1;
>>> +	unsigned short fe_status:15;
>>> +};
>>> +
>>> +struct ocfs2_filecheck_args {
>>> +	unsigned int fa_type;
>>> +	union {
>>> +		unsigned long fa_ino;
>>> +		unsigned int fa_len;
>>> +	};
>>> +};
>>> +
>>> +static const char *
>>> +ocfs2_filecheck_error(int errno)
>>> +{
>>> +	if (!errno)
>>> +		return ocfs2_filecheck_errs[errno];
>>> +
>>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>>> +			errno > OCFS2_FILECHECK_ERR_END);
>>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					char *buf);
>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					const char *buf, size_t count);
>>> +static struct kobj_attribute ocfs2_attr_filecheck =
>>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>>> +					ocfs2_filecheck_show,
>>> +					ocfs2_filecheck_store);
>>> +
>>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>>> +{
>>> +	schedule();
>>> +	return 0;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	struct ocfs2_filecheck_entry *p;
>>> +
>>> +	if (!atomic_dec_and_test(&entry->fs_count))
>>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>>> +						TASK_UNINTERRUPTIBLE);
>>> +
>>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>>> +				struct ocfs2_filecheck_entry, fe_list);
>>> +		list_del(&p->fe_list);
>>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>>> +		kfree(p);
>>> +	}
>>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>>> +
>>> +	kset_unregister(entry->fs_kset);
>>> +	kfree(entry->fs_fcheck);
>>> +	kfree(entry);
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +}
>>> +
>>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>>> +{
>>> +	struct ocfs2_filecheck_sysfs_entry *p;
>>> +
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>> +			list_del(&p->fs_list);
>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +			ocfs2_filecheck_sysfs_free(p);
>>> +			return 0;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +	return 1;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>>> +{
>>> +	if (atomic_dec_and_test(&entry->fs_count))
>>> +		wake_up_atomic_t(&entry->fs_count);
>>> +}
>>> +
>>> +static struct ocfs2_filecheck_sysfs_entry *
>>> +ocfs2_filecheck_sysfs_get(const char *devname)
>>> +{
>>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>>> +
>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>> +			atomic_inc(&p->fs_count);
>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +			return p;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>> +	return NULL;
>>> +}
>>> +
>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>>> +{
>>> +	int ret = 0;
>>> +	struct kset *ocfs2_filecheck_kset = NULL;
>>> +	struct ocfs2_filecheck *fcheck = NULL;
>>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>>> +	struct attribute **attrs = NULL;
>>> +	struct attribute_group attrgp;
>>> +
>>> +	if (!ocfs2_kset)
>>> +		return -ENOMEM;
>>> +
>>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>>> +	if (!attrs) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>>> +		attrs[1] = NULL;
>>> +		memset(&attrgp, 0, sizeof(attrgp));
>>> +		attrgp.attrs = attrs;
>>> +	}
>>> +
>>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>>> +	if (!fcheck) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>>> +		spin_lock_init(&fcheck->fc_lock);
>>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>>> +		fcheck->fc_size = 0;
>>> +		fcheck->fc_done = 0;
>>> +	}
>>> +
>>> +	if (strlen(sb->s_id) <= 0) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get device basename when create filecheck sysfs\n");
>>> +		ret = -ENODEV;
>>> +		goto error;
>>> +	}
>>> +
>>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>>> +						&ocfs2_kset->kobj);
>>> +	if (!ocfs2_filecheck_kset) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	}
>>> +
>>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>>> +	if (ret)
>>> +		goto error;
>>> +
>>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>>> +	if (!entry) {
>>> +		ret = -ENOMEM;
>>> +		goto error;
>>> +	} else {
>>> +		atomic_set(&entry->fs_count, 1);
>>> +		entry->fs_sb = sb;
>>> +		entry->fs_kset = ocfs2_filecheck_kset;
>>> +		entry->fs_fcheck = fcheck;
>>> +		ocfs2_filecheck_sysfs_add(entry);
>>> +	}
>>> +
>>> +	kfree(attrs);
>>> +	return 0;
>>> +
>>> +error:
>>> +	kfree(attrs);
>>> +	kfree(entry);
>>> +	kfree(fcheck);
>>> +	kset_unregister(ocfs2_filecheck_kset);
>>> +	return ret;
>>> +}
>>> +
>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>>> +{
>>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int count);
>>> +static int
>>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int len)
>>> +{
>>> +	int ret;
>>> +
>>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>>> +		return -EINVAL;
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot set online file check maximum entry number "
>>> +		"to %u due to too much pending entries(%u)\n",
>>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>>> +		ret = -EBUSY;
>>> +	} else {
>>> +		if (len < ent->fs_fcheck->fc_size)
>>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>>> +				ent->fs_fcheck->fc_size - len));
>>> +
>>> +		ent->fs_fcheck->fc_max = len;
>>> +		ret = 0;
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>>> +static int
>>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>>> +				unsigned long *val)
>>> +{
>>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>>> +
>>> +	if (count < 1)
>>> +		return 1;
>>> +
>>> +	memcpy(buffer, buf, count);
>>> +	buffer[count] = '\0';
>>> +
>>> +	if (kstrtoul(buffer, 0, val))
>>> +		return 1;
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>>> +				struct ocfs2_filecheck_args *args)
>>> +{
>>> +	unsigned long val = 0;
>>> +
>>> +	/* too short/long args length */
>>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>>> +		return 1;
>>> +
>>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>>> +		args->fa_ino = val;
>>> +		return 0;
>>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>>> +		args->fa_ino = val;
>>> +		return 0;
>>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>> +			return 1;
>>> +
>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>>> +		args->fa_len = (unsigned int)val;
>>> +		return 0;
>>> +	} else { /* invalid args */
>>> +		return 1;
>>> +	}
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>> +					struct kobj_attribute *attr,
>>> +					char *buf)
>>> +{
>>> +
>>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>>> +	struct ocfs2_filecheck_entry *p;
>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>> +
>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>> +	if (!ent) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>> +		kobj->name);
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>>> +	total += ret;
>>> +	remain -= ret;
>>> +
>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>>> +			p->fe_ino, p->fe_type, p->fe_done,
>>> +			ocfs2_filecheck_error(p->fe_status));
>>> +		if (ret < 0) {
>>> +			total = ret;
>>> +			break;
>>> +		}
>>> +		if (ret >= remain) {
>>> +			/* snprintf() didn't fit */
>>> +			total = -E2BIG;
>>> +			break;
>>> +		}
>>> +		total += ret;
>>> +		remain -= ret;
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	ocfs2_filecheck_sysfs_put(ent);
>>> +	return total;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>>> +{
>>> +	struct ocfs2_filecheck_entry *p;
>>> +
>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>> +		if (p->fe_done) {
>>> +			list_del(&p->fe_list);
>>> +			kfree(p);
>>> +			ent->fs_fcheck->fc_size--;
>>> +			ent->fs_fcheck->fc_done--;
>>> +			return 1;
>>> +		}
>>> +	}
>>> +
>>> +	return 0;
>>> +}
>>> +
>>> +static int
>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				unsigned int count)
>>> +{
>>> +	unsigned int i = 0;
>>> +	unsigned int ret = 0;
>>> +
>>> +	while (i++ < count) {
>>> +		if (ocfs2_filecheck_erase_entry(ent))
>>> +			ret++;
>>> +		else
>>> +			break;
>>> +	}
>>> +
>>> +	return (ret == count ? 1 : 0);
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				struct ocfs2_filecheck_entry *entry)
>>> +{
>>> +	entry->fe_done = 1;
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	ent->fs_fcheck->fc_done++;
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +}
>>> +
>>> +static unsigned short
>>> +ocfs2_filecheck_handle(struct super_block *sb,
>>> +				unsigned long ino, unsigned int flags)
>>> +{
>>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>>> +	struct inode *inode = NULL;
>>> +	int rc;
>>> +
>>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>>> +	if (IS_ERR(inode)) {
>>> +		rc = (int)(-(long)inode);
>>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>>> +			rc < OCFS2_FILECHECK_ERR_END)
>>> +			ret = rc;
>>> +		else
>>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>>> +	} else
>>> +		iput(inode);
>>> +
>>> +	return ret;
>>> +}
>>> +
>>> +static void
>>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>> +				struct ocfs2_filecheck_entry *entry)
>>> +{
>>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>>> +	else
>>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>>> +
>>> +	ocfs2_filecheck_done_entry(ent, entry);
>>> +}
>>> +
>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>> +				struct kobj_attribute *attr,
>>> +				const char *buf, size_t count)
>>> +{
>>> +	struct ocfs2_filecheck_args args;
>>> +	struct ocfs2_filecheck_entry *entry = NULL;
>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>> +	ssize_t ret = 0;
>>> +
>>> +	if (count == 0)
>>> +		return count;
>>> +
>>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>>> +		return -EINVAL;
>>> +	}
>>> +
>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>> +	if (!ent) {
>>> +		mlog(ML_ERROR,
>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>> +		kobj->name);
>>> +		return -ENODEV;
>>> +	}
>>> +
>>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>>> +		ocfs2_filecheck_sysfs_put(ent);
>>> +		return (!ret ? count : ret);
>>> +	}
>>> +
>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>> +		(ent->fs_fcheck->fc_done == 0)) {
>>> +		mlog(ML_ERROR,
>>> +		"Online file check queue(%u) is full\n",
>>> +		ent->fs_fcheck->fc_max);
>>> +		ret = -EBUSY;
>>> +	} else {
>>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>> +			(ent->fs_fcheck->fc_done > 0)) {
>>> +			/* Delete the oldest entry which was done,
>>> +			 * make sure the entry size in list does
>>> +			 * not exceed maximum value
>>> +			 */
>>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>>> +		}
>>> +
>>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>>> +		if (entry) {
>>> +			entry->fe_ino = args.fa_ino;
>>> +			entry->fe_type = args.fa_type;
>>> +			entry->fe_done = 0;
>>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>>> +			list_add_tail(&entry->fe_list,
>>> +					&ent->fs_fcheck->fc_head);
>>> +
>>> +			ent->fs_fcheck->fc_size++;
>>> +			ret = count;
>>> +		} else {
>>> +			ret = -ENOMEM;
>>> +		}
>>> +	}
>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>> +
>>> +	if (entry)
>>> +		ocfs2_filecheck_handle_entry(ent, entry);
>>> +
>>> +	ocfs2_filecheck_sysfs_put(ent);
>>> +	return ret;
>>> +}
>>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>>> new file mode 100644
>>> index 0000000..5ec331b
>>> --- /dev/null
>>> +++ b/fs/ocfs2/filecheck.h
>>> @@ -0,0 +1,48 @@
>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>> + *
>>> + * filecheck.h
>>> + *
>>> + * Online file check.
>>> + *
>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>> + *
>>> + * This program is free software; you can redistribute it and/or
>>> + * modify it under the terms of the GNU General Public
>>> + * License as published by the Free Software Foundation, version 2.
>>> + *
>>> + * This program is distributed in the hope that it will be useful,
>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> + * General Public License for more details.
>>> + */
>>> +
>>> +
>>> +#ifndef FILECHECK_H
>>> +#define FILECHECK_H
>>> +
>>> +#include <linux/types.h>
>>> +#include <linux/list.h>
>>> +
>>> +
>>> +/* File check errno */
>>> +enum {
>>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>>> +};
>>> +
>>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>>> +
>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>>> +
>>> +#endif  /* FILECHECK_H */
>>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>>> index 5e86b24..abd1018 100644
>>> --- a/fs/ocfs2/inode.h
>>> +++ b/fs/ocfs2/inode.h
>>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>>  /* Flags for ocfs2_iget() */
>>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>>> +
>>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>>>  			 int sysfile_type);
>>>
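
For reference, the store path in the quoted patch accepts three commands, "CHECK <ino>", "FIX <ino>" and "SET <len>", matched case-insensitively. A minimal userspace sketch of that parsing step follows, under stated assumptions: the fc_ names are hypothetical, strtoul() stands in for the kernel's kstrtoul(), and a table of keywords replaces the chain of strncasecmp() branches. It is an illustration, not the patch's code.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>

enum fc_type { FC_TYPE_CHK = 0, FC_TYPE_FIX, FC_TYPE_SET = 100 };

struct fc_args {
	enum fc_type type;
	unsigned long value;	/* inode number, or queue length for SET */
};

#define FC_ARGS_LEN 32		/* same bound as the patch uses */

/* Parse the numeric tail of a command; returns 0 on success. */
static int fc_parse_ulong(const char *s, size_t len, unsigned long *val)
{
	char tmp[FC_ARGS_LEN + 1];
	char *end;

	if (len < 1 || len > FC_ARGS_LEN)
		return -1;
	memcpy(tmp, s, len);
	tmp[len] = '\0';

	/* strtoul() would accept a sign or leading spaces; reject them. */
	if (!isdigit((unsigned char)tmp[0]))
		return -1;

	errno = 0;
	*val = strtoul(tmp, &end, 0);
	if (errno || end == tmp)
		return -1;
	/* Allow one trailing newline, as produced by echo. */
	if (*end == '\n')
		end++;
	return *end == '\0' ? 0 : -1;
}

/* Match the keyword, then hand the rest to the number parser. */
static int fc_parse(const char *buf, size_t count, struct fc_args *args)
{
	static const struct { const char *kw; enum fc_type type; } cmds[] = {
		{ "CHECK ", FC_TYPE_CHK },
		{ "FIX ",   FC_TYPE_FIX },
		{ "SET ",   FC_TYPE_SET },
	};
	size_t i;

	assert(buf && args);
	if (count > FC_ARGS_LEN)
		return -1;

	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
		size_t n = strlen(cmds[i].kw);

		if (count > n && !strncasecmp(buf, cmds[i].kw, n)) {
			args->type = cmds[i].type;
			return fc_parse_ulong(buf + n, count - n,
					      &args->value);
		}
	}
	return -1;
}
```

Keeping the keywords in a table means a new command needs one new row rather than another else-if branch with its own hard-coded length.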

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  8:15       ` [Ocfs2-devel] " Gang He
@ 2015-11-03  8:29         ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  8:29 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

On 11/03/2015 04:15 PM, Gang He wrote:
> Hello Junxiao,
> 
> See my comments inline.
> 
> 
>>>>
>> Hi Gang,
>>
>> This does not look like the right patch.
>> First, online file check only checks the inode's block number, valid flag,
>> fs generation value, and meta ecc. I have never seen a real corruption
>> that affected only these fields; if these fields are corrupted, that means
>> something bad may have happened elsewhere. So fixing these fields may not
>> help and may even make the corruption worse.
> This online file check/fix feature is used to check/fix light file meta block corruption, instead of taking the file system offline and using fsck.ocfs2.
What is light meta block corruption? Do you have an example case?
> e.g. for a meta ecc error, we really do not need to use fsck.ocfs2.
> Of course, this feature does not replace fsck.ocfs2 or touch complicated meta block problems; if there are potential problems in some areas, we can discuss them one by one.
> 
> 
> 
>> Second, the repair method is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields on disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these fields in memory are
>> right (they may be the ones that are really corrupted)?
> Here, if the inode block was corrupted, the file system is not able to load it into memory.
How do you know the inode block is corrupted? If the bh for the inode block is
overwritten, I mean the bh is corrupted, the repair will corrupt a good inode
block.

Thanks,
Junxiao.

> ocfs2_filecheck_repair_inode_block() will be able to load it into memory, since it tries to fix these light-level problems before loading.
> If the fix succeeds, the changed meta block can pass the block-validate function and be loaded into memory as an inode object.
> Since the file system runs in a cluster environment, we have to use existing functions and code paths to keep these block operations under a cluster lock.
> 
> 
> Thanks
> Gang
> 
>>
>> Thanks,
>> Junxiao.
>> On 10/28/2015 02:26 PM, Gang He wrote:
>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>> +			       struct buffer_head *bh)
>>> +{
>>> +	int rc;
>>> +	int changed = 0;
>>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>> +
>>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>> +	/* Can't fix invalid inode block */
>>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>> +		return rc;
>>> +
>>> +	trace_ocfs2_filecheck_repair_inode_block(
>>> +		(unsigned long long)bh->b_blocknr);
>>> +
>>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>>> +	}
>>> +
>>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>> +			(unsigned long long)bh->b_blocknr,
>>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>>> +	}
>>> +
>>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +	}
>>> +
>>> +	if (le32_to_cpu(di->i_fs_generation) !=
>>> +	    OCFS2_SB(sb)->fs_generation) {
>>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>> +			(unsigned long long)bh->b_blocknr,
>>> +			le32_to_cpu(di->i_fs_generation));
>>> +	}
>>> +
>>> +	if (changed ||
>>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>> +		mark_buffer_dirty(bh);
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
> 
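
For reference, the disagreement above is about which side is trusted when a field mismatches. In the quoted function the expected values come from bh->b_blocknr and the superblock's fs_generation, and the on-disk inode fields are overwritten to match. A minimal userspace model of that check-then-overwrite pattern follows. The struct, the fc_ names and DI_VALID_FL are hypothetical stand-ins, and the meta ecc step is left out. It is a sketch of the pattern being debated, not the ocfs2 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DI_VALID_FL 0x1u	/* stand-in for OCFS2_VALID_FL */

/* Hypothetical, simplified stand-in for the inode fields discussed. */
struct dinode_model {
	uint64_t blkno;
	uint32_t flags;
	uint32_t fs_generation;
};

enum fc_problem {
	FC_OK             = 0,
	FC_BAD_BLKNO      = 1 << 0,
	FC_BAD_VALIDFLAG  = 1 << 1,
	FC_BAD_GENERATION = 1 << 2,
};

/* Read-only check: report which fields disagree with the expected values. */
static unsigned int fc_check(const struct dinode_model *di,
			     uint64_t expect_blkno, uint32_t expect_gen)
{
	unsigned int problems = FC_OK;

	assert(di);
	if (di->blkno != expect_blkno)
		problems |= FC_BAD_BLKNO;
	if (!(di->flags & DI_VALID_FL))
		problems |= FC_BAD_VALIDFLAG;
	if (di->fs_generation != expect_gen)
		problems |= FC_BAD_GENERATION;
	return problems;
}

/*
 * Repair: overwrite every mismatching field with the expected value.
 * Note that this trusts the caller's expected values unconditionally,
 * which is exactly the assumption questioned in the review.
 * Returns true if the inode was modified.
 */
static bool fc_repair(struct dinode_model *di,
		      uint64_t expect_blkno, uint32_t expect_gen)
{
	unsigned int problems = fc_check(di, expect_blkno, expect_gen);

	if (problems & FC_BAD_BLKNO)
		di->blkno = expect_blkno;
	if (problems & FC_BAD_VALIDFLAG)
		di->flags |= DI_VALID_FL;
	if (problems & FC_BAD_GENERATION)
		di->fs_generation = expect_gen;
	return problems != FC_OK;
}
```

Splitting the read-only check from the repair mirrors the CHECK and FIX commands: a caller can report problems without modifying anything, and only the FIX path writes.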


^ permalink raw reply	[flat|nested] 80+ messages in thread

* [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
@ 2015-11-03  8:29         ` Junxiao Bi
  0 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  8:29 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

On 11/03/2015 04:15 PM, Gang He wrote:
> Hello Junxiao,
> 
> See my comments inline.
> 
> 
>>>>
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened only on this field, if these fields are corrupted, that means
>> something bad may happen on other place. So fix this field may not help
>> and even cause corruption more hard.
> This online file check/fix feature is used to check/fix some light file meta block corruption, instead of turning a file system off and using fsck.ocfs2.
What's light meta block corruption? Do you have a case about it?
> e.g. meta ecc error, we really need not to use fsck.ocfs2. 
> of course, this feature does not replace fsck.ocfs2 and touch some complicated meta block problems, if there is some potential problem in some areas, we can discuss them one by one.
> 
> 
> 
>> Second, the repair way is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these field in memory are
>> right(they may be the real corrupted ones)?
> Here, if the inode block was corrupted, the file system is not able to load it into the memory.
How do you know inode block corrupted? If bh for inode block is
overwritten, i mean bh corrupted, the repair will corrupted a good inode
block.

Thanks,
Junxiao.

> ocfs2_filecheck_repair_inode_block() will able to load it into the memory, since it try to fix these light-level problem before loading.
> if the fix is OK, the changed meta-block can pass the block-validate function and load into the memory as a inode object.
> Since the file system is under a cluster environment, we have to use some existing function and code path to keep these block operation under a cluster lock.
> 
> 
> Thanks
> Gang
> 
>>
>> Thanks,
>> Junxiao.
>> On 10/28/2015 02:26 PM, Gang He wrote:
>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>> +			       struct buffer_head *bh)
>>> +{
>>> +	int rc;
>>> +	int changed = 0;
>>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>> +
>>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>> +	/* Can't fix invalid inode block */
>>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>> +		return rc;
>>> +
>>> +	trace_ocfs2_filecheck_repair_inode_block(
>>> +		(unsigned long long)bh->b_blocknr);
>>> +
>>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>>> +	}
>>> +
>>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>> +			(unsigned long long)bh->b_blocknr,
>>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>>> +	}
>>> +
>>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +	}
>>> +
>>> +	if (le32_to_cpu(di->i_fs_generation) !=
>>> +	    OCFS2_SB(sb)->fs_generation) {
>>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>> +		changed = 1;
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>> +			(unsigned long long)bh->b_blocknr,
>>> +			le32_to_cpu(di->i_fs_generation));
>>> +	}
>>> +
>>> +	if (changed ||
>>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>> +		mark_buffer_dirty(bh);
>>> +		mlog(ML_ERROR,
>>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>>> +			(unsigned long long)bh->b_blocknr);
>>> +	}
>>> +
>>> +	return 0;
>>> +}
> 


* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-03  8:20         ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-03  8:30           ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  8:30 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Junxiao,


>>> 
> Hi Gang,
> 
> On 11/03/2015 03:54 PM, Gang He wrote:
>> Hi Junxiao,
>> 
>> Thanks for your review.
>> In the current design, we use a sysfs file as an interface to check/fix a
>> file (by passing an inode number).
>> But this operation is triggered manually by the user, instead of being
>> fixed automatically in the kernel.
>> Why?
>> 1) We should let users make this decision, since some users do not want to
>> fix anything when encountering a file system corruption; they may want to
>> keep the file system unchanged for further investigation.
> If users don't want this, they should not use the error=continue option;
> letting the fs go on after a corruption is very dangerous.
>> 2) Frankly speaking, this feature could bring a second corruption if
>> there is some error in the code, so I do not suggest enabling automatic
>> fixing by default in the first version.
> I think if this feature could bring more corruption, then this should be
> fixed first.
In theory, after our detailed review and discussion, this feature should not introduce any second corruption.
However, my point is that if we overlook something because of our limited experience, it could accidentally cause a second corruption.
That is why I do not suggest enabling automatic fixing by default in the kernel when the feature is first introduced.
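
To make the manual trigger concrete, here is a rough user-space sketch of how an admin tool might send one request to the sysfs file. It is only an illustration: filecheck_format_cmd() and filecheck_request() are hypothetical names, not part of the patch, and the path assumes the /sys/fs/ocfs2/<devname>/filecheck layout from the cover letter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format "<op> <ino>" (e.g. "CHECK 12345") into buf.
 * Returns the command length, or -1 if op is unknown or buf is too small. */
int filecheck_format_cmd(char *buf, size_t size, const char *op,
			 unsigned long ino)
{
	int n;

	assert(buf != NULL && op != NULL);

	/* Only the two per-inode operations are handled in this sketch. */
	if (strcmp(op, "CHECK") != 0 && strcmp(op, "FIX") != 0)
		return -1;

	n = snprintf(buf, size, "%s %lu", op, ino);
	if (n < 0 || (size_t)n >= size)
		return -1;	/* output error or truncated */
	return n;
}

/* Hypothetical helper: write one request to the filecheck attribute.
 * path is assumed to be /sys/fs/ocfs2/<devname>/filecheck.
 * Returns 0 on success, -1 on any failure. */
int filecheck_request(const char *path, const char *op, unsigned long ino)
{
	char cmd[32];	/* the quoted parser rejects requests over 32 bytes */
	FILE *f;
	int n, rc;

	n = filecheck_format_cmd(cmd, sizeof(cmd), op, ino);
	if (n < 0)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	rc = (fwrite(cmd, 1, (size_t)n, f) == (size_t)n) ? 0 : -1;
	if (fclose(f) != 0)	/* the write may only be flushed here */
		rc = -1;
	return rc;
}
```

With this, the user decides when to call filecheck_request(path, "CHECK", ino) and whether to follow up with "FIX"; nothing is repaired unless the user asks for it.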

> 
> Thanks,
> Junxiao
>> 3) In the future, if this feature is well proven, we can add a mount option
>> to enable automatic fixing.
>> 
>> 
>> Thanks
>> Gang
>>    
>> 
>> 
>>>>>
>>> Hi Gang,
>>>
>>> I don't see a need to add a sysfs file for the check and repair. This
>>> leaves a hard problem for the customer to decide. How do they decide whether
>>> they should repair the bad inode, since this may make the corruption even
>>> worse?
>>> I think the error should be fixed by this feature automatically if repair
>>> helps; of course this can be done only when error=continue is enabled, or
>>> we can add some mount option for it.
>>>
>>> Thanks,
>>> Junxiao.
>>>
>>> On 10/28/2015 02:25 PM, Gang He wrote:
>>>> Implement online file check sysfile interfaces, e.g.
>>>> how to create the related sysfile according to device name,
>>>> how to display/handle file check request from the sysfile.
>>>>
>>>> Signed-off-by: Gang He <ghe@suse.com>
>>>> ---
>>>>  fs/ocfs2/Makefile    |   3 +-
>>>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  fs/ocfs2/filecheck.h |  48 +++++
>>>>  fs/ocfs2/inode.h     |   3 +
>>>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>>>  create mode 100644 fs/ocfs2/filecheck.c
>>>>  create mode 100644 fs/ocfs2/filecheck.h
>>>>
>>>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>>>> index ce210d4..e27e652 100644
>>>> --- a/fs/ocfs2/Makefile
>>>> +++ b/fs/ocfs2/Makefile
>>>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>>>  	quota_local.o		\
>>>>  	quota_global.o		\
>>>>  	xattr.o			\
>>>> -	acl.o
>>>> +	acl.o	\
>>>> +	filecheck.o
>>>>  
>>>>  ocfs2_stackglue-objs := stackglue.o
>>>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>>>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>>>> new file mode 100644
>>>> index 0000000..f12ed1f
>>>> --- /dev/null
>>>> +++ b/fs/ocfs2/filecheck.c
>>>> @@ -0,0 +1,566 @@
>>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>>> + *
>>>> + * filecheck.c
>>>> + *
>>>> + * Code which implements online file check.
>>>> + *
>>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU General Public
>>>> + * License as published by the Free Software Foundation, version 2.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * General Public License for more details.
>>>> + */
>>>> +
>>>> +#include <linux/list.h>
>>>> +#include <linux/spinlock.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/kmod.h>
>>>> +#include <linux/fs.h>
>>>> +#include <linux/kobject.h>
>>>> +#include <linux/sysfs.h>
>>>> +#include <linux/sysctl.h>
>>>> +#include <cluster/masklog.h>
>>>> +
>>>> +#include "ocfs2.h"
>>>> +#include "ocfs2_fs.h"
>>>> +#include "stackglue.h"
>>>> +#include "inode.h"
>>>> +
>>>> +#include "filecheck.h"
>>>> +
>>>> +
>>>> +/* File check error strings,
>>>> + * must correspond with error number in header file.
>>>> + */
>>>> +static const char * const ocfs2_filecheck_errs[] = {
>>>> +	"SUCCESS",
>>>> +	"FAILED",
>>>> +	"INPROGRESS",
>>>> +	"READONLY",
>>>> +	"INVALIDINO",
>>>> +	"BLOCKECC",
>>>> +	"BLOCKNO",
>>>> +	"VALIDFLAG",
>>>> +	"GENERATION",
>>>> +	"UNSUPPORTED"
>>>> +};
>>>> +
>>>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>>>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>>>> +
>>>> +struct ocfs2_filecheck {
>>>> +	struct list_head fc_head;	/* File check entry list head */
>>>> +	spinlock_t fc_lock;
>>>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>>>> +	unsigned int fc_size;	/* Current entry count in list */
>>>> +	unsigned int fc_done;	/* File check entries are done in list */
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_sysfs_entry {
>>>> +	struct list_head fs_list;
>>>> +	atomic_t fs_count;
>>>> +	struct super_block *fs_sb;
>>>> +	struct kset *fs_kset;
>>>> +	struct ocfs2_filecheck *fs_fcheck;
>>>> +};
>>>> +
>>>> +#define OCFS2_FILECHECK_MAXSIZE		100
>>>> +#define OCFS2_FILECHECK_MINSIZE		10
>>>> +
>>>> +/* File check operation type */
>>>> +enum {
>>>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>>>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>>>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_entry {
>>>> +	struct list_head fe_list;
>>>> +	unsigned long fe_ino;
>>>> +	unsigned int fe_type;
>>>> +	unsigned short fe_done:1;
>>>> +	unsigned short fe_status:15;
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_args {
>>>> +	unsigned int fa_type;
>>>> +	union {
>>>> +		unsigned long fa_ino;
>>>> +		unsigned int fa_len;
>>>> +	};
>>>> +};
>>>> +
>>>> +static const char *
>>>> +ocfs2_filecheck_error(int errno)
>>>> +{
>>>> +	if (!errno)
>>>> +		return ocfs2_filecheck_errs[errno];
>>>> +
>>>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>>>> +			errno > OCFS2_FILECHECK_ERR_END);
>>>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					char *buf);
>>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					const char *buf, size_t count);
>>>> +static struct kobj_attribute ocfs2_attr_filecheck =
>>>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>>>> +					ocfs2_filecheck_show,
>>>> +					ocfs2_filecheck_store);
>>>> +
>>>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>>>> +{
>>>> +	schedule();
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +
>>>> +	if (!atomic_dec_and_test(&entry->fs_count))
>>>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>>>> +						TASK_UNINTERRUPTIBLE);
>>>> +
>>>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>>>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>>>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>>>> +				struct ocfs2_filecheck_entry, fe_list);
>>>> +		list_del(&p->fe_list);
>>>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>>>> +		kfree(p);
>>>> +	}
>>>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>>>> +
>>>> +	kset_unregister(entry->fs_kset);
>>>> +	kfree(entry->fs_fcheck);
>>>> +	kfree(entry);
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +}
>>>> +
>>>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>>>> +{
>>>> +	struct ocfs2_filecheck_sysfs_entry *p;
>>>> +
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>>> +			list_del(&p->fs_list);
>>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +			ocfs2_filecheck_sysfs_free(p);
>>>> +			return 0;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +	return 1;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	if (atomic_dec_and_test(&entry->fs_count))
>>>> +		wake_up_atomic_t(&entry->fs_count);
>>>> +}
>>>> +
>>>> +static struct ocfs2_filecheck_sysfs_entry *
>>>> +ocfs2_filecheck_sysfs_get(const char *devname)
>>>> +{
>>>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>>>> +
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>>> +			atomic_inc(&p->fs_count);
>>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +			return p;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>>>> +{
>>>> +	int ret = 0;
>>>> +	struct kset *ocfs2_filecheck_kset = NULL;
>>>> +	struct ocfs2_filecheck *fcheck = NULL;
>>>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>>>> +	struct attribute **attrs = NULL;
>>>> +	struct attribute_group attrgp;
>>>> +
>>>> +	if (!ocfs2_kset)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>>>> +	if (!attrs) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>>>> +		attrs[1] = NULL;
>>>> +		memset(&attrgp, 0, sizeof(attrgp));
>>>> +		attrgp.attrs = attrs;
>>>> +	}
>>>> +
>>>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>>>> +	if (!fcheck) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>>>> +		spin_lock_init(&fcheck->fc_lock);
>>>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>>>> +		fcheck->fc_size = 0;
>>>> +		fcheck->fc_done = 0;
>>>> +	}
>>>> +
>>>> +	if (strlen(sb->s_id) <= 0) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get device basename when create filecheck sysfs\n");
>>>> +		ret = -ENODEV;
>>>> +		goto error;
>>>> +	}
>>>> +
>>>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>>>> +						&ocfs2_kset->kobj);
>>>> +	if (!ocfs2_filecheck_kset) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	}
>>>> +
>>>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>>>> +	if (ret)
>>>> +		goto error;
>>>> +
>>>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>>>> +	if (!entry) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		atomic_set(&entry->fs_count, 1);
>>>> +		entry->fs_sb = sb;
>>>> +		entry->fs_kset = ocfs2_filecheck_kset;
>>>> +		entry->fs_fcheck = fcheck;
>>>> +		ocfs2_filecheck_sysfs_add(entry);
>>>> +	}
>>>> +
>>>> +	kfree(attrs);
>>>> +	return 0;
>>>> +
>>>> +error:
>>>> +	kfree(attrs);
>>>> +	kfree(entry);
>>>> +	kfree(fcheck);
>>>> +	kset_unregister(ocfs2_filecheck_kset);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>>>> +{
>>>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int count);
>>>> +static int
>>>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int len)
>>>> +{
>>>> +	int ret;
>>>> +
>>>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>>>> +		return -EINVAL;
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot set online file check maximum entry number "
>>>> +		"to %u due to too much pending entries(%u)\n",
>>>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>>>> +		ret = -EBUSY;
>>>> +	} else {
>>>> +		if (len < ent->fs_fcheck->fc_size)
>>>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>>>> +				ent->fs_fcheck->fc_size - len));
>>>> +
>>>> +		ent->fs_fcheck->fc_max = len;
>>>> +		ret = 0;
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>>>> +static int
>>>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>>>> +				unsigned long *val)
>>>> +{
>>>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>>>> +
>>>> +	if (count < 1)
>>>> +		return 1;
>>>> +
>>>> +	memcpy(buffer, buf, count);
>>>> +	buffer[count] = '\0';
>>>> +
>>>> +	if (kstrtoul(buffer, 0, val))
>>>> +		return 1;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>>>> +				struct ocfs2_filecheck_args *args)
>>>> +{
>>>> +	unsigned long val = 0;
>>>> +
>>>> +	/* too short/long args length */
>>>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>>>> +		return 1;
>>>> +
>>>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>>>> +		args->fa_ino = val;
>>>> +		return 0;
>>>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>>>> +		args->fa_ino = val;
>>>> +		return 0;
>>>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>>>> +		args->fa_len = (unsigned int)val;
>>>> +		return 0;
>>>> +	} else { /* invalid args */
>>>> +		return 1;
>>>> +	}
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					char *buf)
>>>> +{
>>>> +
>>>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>>> +
>>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>>> +	if (!ent) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>>> +		kobj->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>>>> +	total += ret;
>>>> +	remain -= ret;
>>>> +
>>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>>>> +			p->fe_ino, p->fe_type, p->fe_done,
>>>> +			ocfs2_filecheck_error(p->fe_status));
>>>> +		if (ret < 0) {
>>>> +			total = ret;
>>>> +			break;
>>>> +		}
>>>> +		if (ret >= remain) {
>>>> +			/* snprintf() didn't fit */
>>>> +			total = -E2BIG;
>>>> +			break;
>>>> +		}
>>>> +		total += ret;
>>>> +		remain -= ret;
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	ocfs2_filecheck_sysfs_put(ent);
>>>> +	return total;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>>>> +{
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +
>>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>>> +		if (p->fe_done) {
>>>> +			list_del(&p->fe_list);
>>>> +			kfree(p);
>>>> +			ent->fs_fcheck->fc_size--;
>>>> +			ent->fs_fcheck->fc_done--;
>>>> +			return 1;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int count)
>>>> +{
>>>> +	unsigned int i = 0;
>>>> +	unsigned int ret = 0;
>>>> +
>>>> +	while (i++ < count) {
>>>> +		if (ocfs2_filecheck_erase_entry(ent))
>>>> +			ret++;
>>>> +		else
>>>> +			break;
>>>> +	}
>>>> +
>>>> +	return (ret == count ? 1 : 0);
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				struct ocfs2_filecheck_entry *entry)
>>>> +{
>>>> +	entry->fe_done = 1;
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	ent->fs_fcheck->fc_done++;
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +}
>>>> +
>>>> +static unsigned short
>>>> +ocfs2_filecheck_handle(struct super_block *sb,
>>>> +				unsigned long ino, unsigned int flags)
>>>> +{
>>>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>>>> +	struct inode *inode = NULL;
>>>> +	int rc;
>>>> +
>>>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>>>> +	if (IS_ERR(inode)) {
>>>> +		rc = (int)(-(long)inode);
>>>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>>>> +			rc < OCFS2_FILECHECK_ERR_END)
>>>> +			ret = rc;
>>>> +		else
>>>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>>>> +	} else
>>>> +		iput(inode);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				struct ocfs2_filecheck_entry *entry)
>>>> +{
>>>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>>>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>>>> +	else
>>>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>>>> +
>>>> +	ocfs2_filecheck_done_entry(ent, entry);
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>>> +				struct kobj_attribute *attr,
>>>> +				const char *buf, size_t count)
>>>> +{
>>>> +	struct ocfs2_filecheck_args args;
>>>> +	struct ocfs2_filecheck_entry *entry = NULL;
>>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>>> +	ssize_t ret = 0;
>>>> +
>>>> +	if (count == 0)
>>>> +		return count;
>>>> +
>>>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>>>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>>> +	if (!ent) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>>> +		kobj->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>>>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>>>> +		ocfs2_filecheck_sysfs_put(ent);
>>>> +		return (!ret ? count : ret);
>>>> +	}
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>>> +		(ent->fs_fcheck->fc_done == 0)) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Online file check queue(%u) is full\n",
>>>> +		ent->fs_fcheck->fc_max);
>>>> +		ret = -EBUSY;
>>>> +	} else {
>>>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>>> +			(ent->fs_fcheck->fc_done > 0)) {
>>>> +			/* Delete the oldest entry which was done,
>>>> +			 * make sure the entry size in list does
>>>> +			 * not exceed maximum value
>>>> +			 */
>>>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>>>> +		}
>>>> +
>>>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>>>> +		if (entry) {
>>>> +			entry->fe_ino = args.fa_ino;
>>>> +			entry->fe_type = args.fa_type;
>>>> +			entry->fe_done = 0;
>>>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>>>> +			list_add_tail(&entry->fe_list,
>>>> +					&ent->fs_fcheck->fc_head);
>>>> +
>>>> +			ent->fs_fcheck->fc_size++;
>>>> +			ret = count;
>>>> +		} else {
>>>> +			ret = -ENOMEM;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	if (entry)
>>>> +		ocfs2_filecheck_handle_entry(ent, entry);
>>>> +
>>>> +	ocfs2_filecheck_sysfs_put(ent);
>>>> +	return ret;
>>>> +}
>>>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>>>> new file mode 100644
>>>> index 0000000..5ec331b
>>>> --- /dev/null
>>>> +++ b/fs/ocfs2/filecheck.h
>>>> @@ -0,0 +1,48 @@
>>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>>> + *
>>>> + * filecheck.h
>>>> + *
>>>> + * Online file check.
>>>> + *
>>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU General Public
>>>> + * License as published by the Free Software Foundation, version 2.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * General Public License for more details.
>>>> + */
>>>> +
>>>> +
>>>> +#ifndef FILECHECK_H
>>>> +#define FILECHECK_H
>>>> +
>>>> +#include <linux/types.h>
>>>> +#include <linux/list.h>
>>>> +
>>>> +
>>>> +/* File check errno */
>>>> +enum {
>>>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>>>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>>>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>>>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>>>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>>>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>>>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>>>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>>>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>>>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>>>> +};
>>>> +
>>>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>>>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>>>> +
>>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>>>> +
>>>> +#endif  /* FILECHECK_H */
>>>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>>>> index 5e86b24..abd1018 100644
>>>> --- a/fs/ocfs2/inode.h
>>>> +++ b/fs/ocfs2/inode.h
>>>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>>>  /* Flags for ocfs2_iget() */
>>>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>>>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>>>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>>>> +
>>>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned flags,
>>>>  			 int sysfile_type);
>>>>
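
For reference, the request syntax accepted by the quoted ocfs2_filecheck_args_parse() ("CHECK <ino>", "FIX <ino>", "SET <len>", case-insensitive, 5 to 32 bytes) can be sketched in user space roughly as below. This is a simplified illustration, not the kernel code: the fc_* names are hypothetical, and strtoul() stands in for kstrtoul(), so details such as a leading '+' are handled differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>	/* strncasecmp() */

#define FC_ARGS_LEN 32	/* mirrors OCFS2_FILECHECK_ARGS_LEN in the patch */

enum fc_type { FC_CHK = 0, FC_FIX, FC_SET = 100 };

struct fc_args {
	enum fc_type type;
	unsigned long val;	/* inode number, or queue length for SET */
};

/* Parse an unsigned number from s[0..len); one trailing newline is allowed.
 * Returns 0 on success, 1 on error. */
static int fc_parse_ulong(const char *s, size_t len, unsigned long *val)
{
	char tmp[FC_ARGS_LEN + 1];	/* +1 so the terminator always fits */
	char *end;

	if (len < 1 || len > FC_ARGS_LEN)
		return 1;
	memcpy(tmp, s, len);
	tmp[len] = '\0';

	/* strtoul() would accept leading blanks or a sign; reject them. */
	if (tmp[0] < '0' || tmp[0] > '9')
		return 1;

	errno = 0;
	*val = strtoul(tmp, &end, 0);	/* base 0: decimal, 0x..., 0... */
	if (errno || end == tmp)
		return 1;
	if (*end == '\n')
		end++;
	return *end != '\0';
}

/* Returns 0 and fills *args on success, 1 on an invalid request. */
int fc_parse(const char *buf, size_t count, struct fc_args *args)
{
	static const struct {
		const char *kw;
		enum fc_type type;
	} cmds[] = {
		{ "FIX ", FC_FIX }, { "CHECK ", FC_CHK }, { "SET ", FC_SET },
	};
	size_t i;

	assert(buf != NULL && args != NULL);

	/* Too short or too long, same bounds as the quoted code. */
	if (count < 5 || count > FC_ARGS_LEN)
		return 1;

	for (i = 0; i < sizeof(cmds) / sizeof(cmds[0]); i++) {
		size_t klen = strlen(cmds[i].kw);

		if (count > klen && !strncasecmp(buf, cmds[i].kw, klen)) {
			if (fc_parse_ulong(buf + klen, count - klen,
					   &args->val))
				return 1;
			args->type = cmds[i].type;
			return 0;
		}
	}
	return 1;	/* unknown keyword */
}
```

The keyword table keeps the three branches of the original in one loop; the temporary buffer is one byte larger than the maximum request so that writing the terminator can never run past it.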

^ permalink raw reply	[flat|nested] 80+ messages in thread

* [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
@ 2015-11-03  8:30           ` Gang He
  0 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  8:30 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

Hi Junxiao,


>>> 
> Hi Gang,
> 
> On 11/03/2015 03:54 PM, Gang He wrote:
>> Hi Junxiao,
>> 
>> Thanks for your review.
>> In the current design, we use a sysfs file as an interface to check/fix a
>> file (by passing an inode number).
>> But this operation is triggered manually by the user, instead of being
>> fixed automatically in the kernel.
>> Why?
>> 1) We should let users make this decision, since some users do not want to
>> fix anything when encountering a file system corruption; they may want to
>> keep the file system unchanged for further investigation.
> If users don't want this, they should not use the error=continue option;
> letting the fs go on after a corruption is very dangerous.
>> 2) Frankly speaking, this feature could bring a second corruption if
>> there is some error in the code, so I do not suggest enabling automatic
>> fixing by default in the first version.
> I think if this feature could bring more corruption, then this should be
> fixed first.
In theory, after our detailed review and discussion, this feature should not introduce any second corruption.
However, my point is that if we overlook something because of our limited experience, it could accidentally cause a second corruption.
That is why I do not suggest enabling automatic fixing by default in the kernel when the feature is first introduced.

> 
> Thanks,
> Junxiao
>> 3) In the future, if this feature is well proven, we can add a mount option
>> to enable automatic fixing.
>> 
>> 
>> Thanks
>> Gang
>>    
>> 
>> 
>>>>>
>>> Hi Gang,
>>>
>>> I don't see a need to add a sysfs file for the check and repair. This
>>> leaves a hard problem for the customer to decide. How do they decide whether
>>> they should repair the bad inode, since this may make the corruption even
>>> worse?
>>> I think the error should be fixed by this feature automatically if repair
>>> helps; of course this can be done only when error=continue is enabled, or
>>> we can add some mount option for it.
>>>
>>> Thanks,
>>> Junxiao.
>>>
>>> On 10/28/2015 02:25 PM, Gang He wrote:
>>>> Implement online file check sysfile interfaces, e.g.
>>>> how to create the related sysfile according to device name,
>>>> how to display/handle file check request from the sysfile.
>>>>
>>>> Signed-off-by: Gang He <ghe@suse.com>
>>>> ---
>>>>  fs/ocfs2/Makefile    |   3 +-
>>>>  fs/ocfs2/filecheck.c | 566 +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>  fs/ocfs2/filecheck.h |  48 +++++
>>>>  fs/ocfs2/inode.h     |   3 +
>>>>  4 files changed, 619 insertions(+), 1 deletion(-)
>>>>  create mode 100644 fs/ocfs2/filecheck.c
>>>>  create mode 100644 fs/ocfs2/filecheck.h
>>>>
>>>> diff --git a/fs/ocfs2/Makefile b/fs/ocfs2/Makefile
>>>> index ce210d4..e27e652 100644
>>>> --- a/fs/ocfs2/Makefile
>>>> +++ b/fs/ocfs2/Makefile
>>>> @@ -41,7 +41,8 @@ ocfs2-objs := \
>>>>  	quota_local.o		\
>>>>  	quota_global.o		\
>>>>  	xattr.o			\
>>>> -	acl.o
>>>> +	acl.o	\
>>>> +	filecheck.o
>>>>  
>>>>  ocfs2_stackglue-objs := stackglue.o
>>>>  ocfs2_stack_o2cb-objs := stack_o2cb.o
>>>> diff --git a/fs/ocfs2/filecheck.c b/fs/ocfs2/filecheck.c
>>>> new file mode 100644
>>>> index 0000000..f12ed1f
>>>> --- /dev/null
>>>> +++ b/fs/ocfs2/filecheck.c
>>>> @@ -0,0 +1,566 @@
>>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>>> + *
>>>> + * filecheck.c
>>>> + *
>>>> + * Code which implements online file check.
>>>> + *
>>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU General Public
>>>> + * License as published by the Free Software Foundation, version 2.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * General Public License for more details.
>>>> + */
>>>> +
>>>> +#include <linux/list.h>
>>>> +#include <linux/spinlock.h>
>>>> +#include <linux/module.h>
>>>> +#include <linux/slab.h>
>>>> +#include <linux/kmod.h>
>>>> +#include <linux/fs.h>
>>>> +#include <linux/kobject.h>
>>>> +#include <linux/sysfs.h>
>>>> +#include <linux/sysctl.h>
>>>> +#include <cluster/masklog.h>
>>>> +
>>>> +#include "ocfs2.h"
>>>> +#include "ocfs2_fs.h"
>>>> +#include "stackglue.h"
>>>> +#include "inode.h"
>>>> +
>>>> +#include "filecheck.h"
>>>> +
>>>> +
>>>> +/* File check error strings,
>>>> + * must correspond with error number in header file.
>>>> + */
>>>> +static const char * const ocfs2_filecheck_errs[] = {
>>>> +	"SUCCESS",
>>>> +	"FAILED",
>>>> +	"INPROGRESS",
>>>> +	"READONLY",
>>>> +	"INVALIDINO",
>>>> +	"BLOCKECC",
>>>> +	"BLOCKNO",
>>>> +	"VALIDFLAG",
>>>> +	"GENERATION",
>>>> +	"UNSUPPORTED"
>>>> +};
>>>> +
>>>> +static DEFINE_SPINLOCK(ocfs2_filecheck_sysfs_lock);
>>>> +static LIST_HEAD(ocfs2_filecheck_sysfs_list);
>>>> +
>>>> +struct ocfs2_filecheck {
>>>> +	struct list_head fc_head;	/* File check entry list head */
>>>> +	spinlock_t fc_lock;
>>>> +	unsigned int fc_max;	/* Maximum number of entry in list */
>>>> +	unsigned int fc_size;	/* Current entry count in list */
>>>> +	unsigned int fc_done;	/* File check entries are done in list */
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_sysfs_entry {
>>>> +	struct list_head fs_list;
>>>> +	atomic_t fs_count;
>>>> +	struct super_block *fs_sb;
>>>> +	struct kset *fs_kset;
>>>> +	struct ocfs2_filecheck *fs_fcheck;
>>>> +};
>>>> +
>>>> +#define OCFS2_FILECHECK_MAXSIZE		100
>>>> +#define OCFS2_FILECHECK_MINSIZE		10
>>>> +
>>>> +/* File check operation type */
>>>> +enum {
>>>> +	OCFS2_FILECHECK_TYPE_CHK = 0,	/* Check a file */
>>>> +	OCFS2_FILECHECK_TYPE_FIX,	/* Fix a file */
>>>> +	OCFS2_FILECHECK_TYPE_SET = 100	/* Set file check options */
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_entry {
>>>> +	struct list_head fe_list;
>>>> +	unsigned long fe_ino;
>>>> +	unsigned int fe_type;
>>>> +	unsigned short fe_done:1;
>>>> +	unsigned short fe_status:15;
>>>> +};
>>>> +
>>>> +struct ocfs2_filecheck_args {
>>>> +	unsigned int fa_type;
>>>> +	union {
>>>> +		unsigned long fa_ino;
>>>> +		unsigned int fa_len;
>>>> +	};
>>>> +};
>>>> +
>>>> +static const char *
>>>> +ocfs2_filecheck_error(int errno)
>>>> +{
>>>> +	if (!errno)
>>>> +		return ocfs2_filecheck_errs[errno];
>>>> +
>>>> +	BUG_ON(errno < OCFS2_FILECHECK_ERR_START ||
>>>> +			errno > OCFS2_FILECHECK_ERR_END);
>>>> +	return ocfs2_filecheck_errs[errno - OCFS2_FILECHECK_ERR_START + 1];
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					char *buf);
>>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					const char *buf, size_t count);
>>>> +static struct kobj_attribute ocfs2_attr_filecheck =
>>>> +					__ATTR(filecheck, S_IRUSR | S_IWUSR,
>>>> +					ocfs2_filecheck_show,
>>>> +					ocfs2_filecheck_store);
>>>> +
>>>> +static int ocfs2_filecheck_sysfs_wait(atomic_t *p)
>>>> +{
>>>> +	schedule();
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_free(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +
>>>> +	if (!atomic_dec_and_test(&entry->fs_count))
>>>> +		wait_on_atomic_t(&entry->fs_count, ocfs2_filecheck_sysfs_wait,
>>>> +						TASK_UNINTERRUPTIBLE);
>>>> +
>>>> +	spin_lock(&entry->fs_fcheck->fc_lock);
>>>> +	while (!list_empty(&entry->fs_fcheck->fc_head)) {
>>>> +		p = list_first_entry(&entry->fs_fcheck->fc_head,
>>>> +				struct ocfs2_filecheck_entry, fe_list);
>>>> +		list_del(&p->fe_list);
>>>> +		BUG_ON(!p->fe_done); /* To free a undone file check entry */
>>>> +		kfree(p);
>>>> +	}
>>>> +	spin_unlock(&entry->fs_fcheck->fc_lock);
>>>> +
>>>> +	kset_unregister(entry->fs_kset);
>>>> +	kfree(entry->fs_fcheck);
>>>> +	kfree(entry);
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_add(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_add_tail(&entry->fs_list, &ocfs2_filecheck_sysfs_list);
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +}
>>>> +
>>>> +static int ocfs2_filecheck_sysfs_del(const char *devname)
>>>> +{
>>>> +	struct ocfs2_filecheck_sysfs_entry *p;
>>>> +
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>>> +			list_del(&p->fs_list);
>>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +			ocfs2_filecheck_sysfs_free(p);
>>>> +			return 0;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +	return 1;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_sysfs_put(struct ocfs2_filecheck_sysfs_entry *entry)
>>>> +{
>>>> +	if (atomic_dec_and_test(&entry->fs_count))
>>>> +		wake_up_atomic_t(&entry->fs_count);
>>>> +}
>>>> +
>>>> +static struct ocfs2_filecheck_sysfs_entry *
>>>> +ocfs2_filecheck_sysfs_get(const char *devname)
>>>> +{
>>>> +	struct ocfs2_filecheck_sysfs_entry *p = NULL;
>>>> +
>>>> +	spin_lock(&ocfs2_filecheck_sysfs_lock);
>>>> +	list_for_each_entry(p, &ocfs2_filecheck_sysfs_list, fs_list) {
>>>> +		if (!strcmp(p->fs_sb->s_id, devname)) {
>>>> +			atomic_inc(&p->fs_count);
>>>> +			spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +			return p;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ocfs2_filecheck_sysfs_lock);
>>>> +	return NULL;
>>>> +}
>>>> +
>>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb)
>>>> +{
>>>> +	int ret = 0;
>>>> +	struct kset *ocfs2_filecheck_kset = NULL;
>>>> +	struct ocfs2_filecheck *fcheck = NULL;
>>>> +	struct ocfs2_filecheck_sysfs_entry *entry = NULL;
>>>> +	struct attribute **attrs = NULL;
>>>> +	struct attribute_group attrgp;
>>>> +
>>>> +	if (!ocfs2_kset)
>>>> +		return -ENOMEM;
>>>> +
>>>> +	attrs = kmalloc(sizeof(struct attribute *) * 2, GFP_NOFS);
>>>> +	if (!attrs) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		attrs[0] = &ocfs2_attr_filecheck.attr;
>>>> +		attrs[1] = NULL;
>>>> +		memset(&attrgp, 0, sizeof(attrgp));
>>>> +		attrgp.attrs = attrs;
>>>> +	}
>>>> +
>>>> +	fcheck = kmalloc(sizeof(struct ocfs2_filecheck), GFP_NOFS);
>>>> +	if (!fcheck) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		INIT_LIST_HEAD(&fcheck->fc_head);
>>>> +		spin_lock_init(&fcheck->fc_lock);
>>>> +		fcheck->fc_max = OCFS2_FILECHECK_MINSIZE;
>>>> +		fcheck->fc_size = 0;
>>>> +		fcheck->fc_done = 0;
>>>> +	}
>>>> +
>>>> +	if (strlen(sb->s_id) <= 0) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get device basename when create filecheck sysfs\n");
>>>> +		ret = -ENODEV;
>>>> +		goto error;
>>>> +	}
>>>> +
>>>> +	ocfs2_filecheck_kset = kset_create_and_add(sb->s_id, NULL,
>>>> +						&ocfs2_kset->kobj);
>>>> +	if (!ocfs2_filecheck_kset) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	}
>>>> +
>>>> +	ret = sysfs_create_group(&ocfs2_filecheck_kset->kobj, &attrgp);
>>>> +	if (ret)
>>>> +		goto error;
>>>> +
>>>> +	entry = kmalloc(sizeof(struct ocfs2_filecheck_sysfs_entry), GFP_NOFS);
>>>> +	if (!entry) {
>>>> +		ret = -ENOMEM;
>>>> +		goto error;
>>>> +	} else {
>>>> +		atomic_set(&entry->fs_count, 1);
>>>> +		entry->fs_sb = sb;
>>>> +		entry->fs_kset = ocfs2_filecheck_kset;
>>>> +		entry->fs_fcheck = fcheck;
>>>> +		ocfs2_filecheck_sysfs_add(entry);
>>>> +	}
>>>> +
>>>> +	kfree(attrs);
>>>> +	return 0;
>>>> +
>>>> +error:
>>>> +	kfree(attrs);
>>>> +	kfree(entry);
>>>> +	kfree(fcheck);
>>>> +	kset_unregister(ocfs2_filecheck_kset);
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb)
>>>> +{
>>>> +	return ocfs2_filecheck_sysfs_del(sb->s_id);
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int count);
>>>> +static int
>>>> +ocfs2_filecheck_adjust_max(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int len)
>>>> +{
>>>> +	int ret;
>>>> +
>>>> +	if ((len < OCFS2_FILECHECK_MINSIZE) || (len > OCFS2_FILECHECK_MAXSIZE))
>>>> +		return -EINVAL;
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	if (len < (ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done)) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot set online file check maximum entry number "
>>>> +		"to %u due to too much pending entries(%u)\n",
>>>> +		len, ent->fs_fcheck->fc_size - ent->fs_fcheck->fc_done);
>>>> +		ret = -EBUSY;
>>>> +	} else {
>>>> +		if (len < ent->fs_fcheck->fc_size)
>>>> +			BUG_ON(!ocfs2_filecheck_erase_entries(ent,
>>>> +				ent->fs_fcheck->fc_size - len));
>>>> +
>>>> +		ent->fs_fcheck->fc_max = len;
>>>> +		ret = 0;
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +#define OCFS2_FILECHECK_ARGS_LEN	32
>>>> +static int
>>>> +ocfs2_filecheck_args_get_long(const char *buf, size_t count,
>>>> +				unsigned long *val)
>>>> +{
>>>> +	char buffer[OCFS2_FILECHECK_ARGS_LEN];
>>>> +
>>>> +	if (count < 1)
>>>> +		return 1;
>>>> +
>>>> +	memcpy(buffer, buf, count);
>>>> +	buffer[count] = '\0';
>>>> +
>>>> +	if (kstrtoul(buffer, 0, val))
>>>> +		return 1;
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_args_parse(const char *buf, size_t count,
>>>> +				struct ocfs2_filecheck_args *args)
>>>> +{
>>>> +	unsigned long val = 0;
>>>> +
>>>> +	/* too short/long args length */
>>>> +	if ((count < 5) || (count > OCFS2_FILECHECK_ARGS_LEN))
>>>> +		return 1;
>>>> +
>>>> +	if (!strncasecmp(buf, "FIX ", 4)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_FIX;
>>>> +		args->fa_ino = val;
>>>> +		return 0;
>>>> +	} else if ((count > 6) && !strncasecmp(buf, "CHECK ", 6)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 6, count - 6, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_CHK;
>>>> +		args->fa_ino = val;
>>>> +		return 0;
>>>> +	} else if (!strncasecmp(buf, "SET ", 4)) {
>>>> +		if (ocfs2_filecheck_args_get_long(buf + 4, count - 4, &val))
>>>> +			return 1;
>>>> +
>>>> +		args->fa_type = OCFS2_FILECHECK_TYPE_SET;
>>>> +		args->fa_len = (unsigned int)val;
>>>> +		return 0;
>>>> +	} else { /* invalid args */
>>>> +		return 1;
>>>> +	}
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_show(struct kobject *kobj,
>>>> +					struct kobj_attribute *attr,
>>>> +					char *buf)
>>>> +{
>>>> +
>>>> +	ssize_t ret = 0, total = 0, remain = PAGE_SIZE;
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>>> +
>>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>>> +	if (!ent) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>>> +		kobj->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	ret = snprintf(buf, remain, "INO\t\tTYPE\tDONE\tERROR\n");
>>>> +	total += ret;
>>>> +	remain -= ret;
>>>> +
>>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>>> +		ret = snprintf(buf + total, remain, "%lu\t\t%u\t%u\t%s\n",
>>>> +			p->fe_ino, p->fe_type, p->fe_done,
>>>> +			ocfs2_filecheck_error(p->fe_status));
>>>> +		if (ret < 0) {
>>>> +			total = ret;
>>>> +			break;
>>>> +		}
>>>> +		if (ret == remain) {
>>>> +			/* snprintf() didn't fit */
>>>> +			total = -E2BIG;
>>>> +			break;
>>>> +		}
>>>> +		total += ret;
>>>> +		remain -= ret;
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	ocfs2_filecheck_sysfs_put(ent);
>>>> +	return total;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entry(struct ocfs2_filecheck_sysfs_entry *ent)
>>>> +{
>>>> +	struct ocfs2_filecheck_entry *p;
>>>> +
>>>> +	list_for_each_entry(p, &ent->fs_fcheck->fc_head, fe_list) {
>>>> +		if (p->fe_done) {
>>>> +			list_del(&p->fe_list);
>>>> +			kfree(p);
>>>> +			ent->fs_fcheck->fc_size--;
>>>> +			ent->fs_fcheck->fc_done--;
>>>> +			return 1;
>>>> +		}
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +static int
>>>> +ocfs2_filecheck_erase_entries(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				unsigned int count)
>>>> +{
>>>> +	unsigned int i = 0;
>>>> +	unsigned int ret = 0;
>>>> +
>>>> +	while (i++ < count) {
>>>> +		if (ocfs2_filecheck_erase_entry(ent))
>>>> +			ret++;
>>>> +		else
>>>> +			break;
>>>> +	}
>>>> +
>>>> +	return (ret == count ? 1 : 0);
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_done_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				struct ocfs2_filecheck_entry *entry)
>>>> +{
>>>> +	entry->fe_done = 1;
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	ent->fs_fcheck->fc_done++;
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +}
>>>> +
>>>> +static unsigned short
>>>> +ocfs2_filecheck_handle(struct super_block *sb,
>>>> +				unsigned long ino, unsigned int flags)
>>>> +{
>>>> +	unsigned short ret = OCFS2_FILECHECK_ERR_SUCCESS;
>>>> +	struct inode *inode = NULL;
>>>> +	int rc;
>>>> +
>>>> +	inode = ocfs2_iget(OCFS2_SB(sb), ino, flags, 0);
>>>> +	if (IS_ERR(inode)) {
>>>> +		rc = (int)(-(long)inode);
>>>> +		if (rc >= OCFS2_FILECHECK_ERR_START &&
>>>> +			rc < OCFS2_FILECHECK_ERR_END)
>>>> +			ret = rc;
>>>> +		else
>>>> +			ret = OCFS2_FILECHECK_ERR_FAILED;
>>>> +	} else
>>>> +		iput(inode);
>>>> +
>>>> +	return ret;
>>>> +}
>>>> +
>>>> +static void
>>>> +ocfs2_filecheck_handle_entry(struct ocfs2_filecheck_sysfs_entry *ent,
>>>> +				struct ocfs2_filecheck_entry *entry)
>>>> +{
>>>> +	if (entry->fe_type == OCFS2_FILECHECK_TYPE_CHK)
>>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_CHK);
>>>> +	else if (entry->fe_type == OCFS2_FILECHECK_TYPE_FIX)
>>>> +		entry->fe_status = ocfs2_filecheck_handle(ent->fs_sb,
>>>> +				entry->fe_ino, OCFS2_FI_FLAG_FILECHECK_FIX);
>>>> +	else
>>>> +		entry->fe_status = OCFS2_FILECHECK_ERR_UNSUPPORTED;
>>>> +
>>>> +	ocfs2_filecheck_done_entry(ent, entry);
>>>> +}
>>>> +
>>>> +static ssize_t ocfs2_filecheck_store(struct kobject *kobj,
>>>> +				struct kobj_attribute *attr,
>>>> +				const char *buf, size_t count)
>>>> +{
>>>> +	struct ocfs2_filecheck_args args;
>>>> +	struct ocfs2_filecheck_entry *entry = NULL;
>>>> +	struct ocfs2_filecheck_sysfs_entry *ent;
>>>> +	ssize_t ret = 0;
>>>> +
>>>> +	if (count == 0)
>>>> +		return count;
>>>> +
>>>> +	if (ocfs2_filecheck_args_parse(buf, count, &args)) {
>>>> +		mlog(ML_ERROR, "Invalid arguments for online file check\n");
>>>> +		return -EINVAL;
>>>> +	}
>>>> +
>>>> +	ent = ocfs2_filecheck_sysfs_get(kobj->name);
>>>> +	if (!ent) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Cannot get the corresponding entry via device basename %s\n",
>>>> +		kobj->name);
>>>> +		return -ENODEV;
>>>> +	}
>>>> +
>>>> +	if (args.fa_type == OCFS2_FILECHECK_TYPE_SET) {
>>>> +		ret = ocfs2_filecheck_adjust_max(ent, args.fa_len);
>>>> +		ocfs2_filecheck_sysfs_put(ent);
>>>> +		return (!ret ? count : ret);
>>>> +	}
>>>> +
>>>> +	spin_lock(&ent->fs_fcheck->fc_lock);
>>>> +	if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>>> +		(ent->fs_fcheck->fc_done == 0)) {
>>>> +		mlog(ML_ERROR,
>>>> +		"Online file check queue(%u) is full\n",
>>>> +		ent->fs_fcheck->fc_max);
>>>> +		ret = -EBUSY;
>>>> +	} else {
>>>> +		if ((ent->fs_fcheck->fc_size >= ent->fs_fcheck->fc_max) &&
>>>> +			(ent->fs_fcheck->fc_done > 0)) {
>>>> +			/* Delete the oldest entry which was done,
>>>> +			 * make sure the entry size in list does
>>>> +			 * not exceed maximum value
>>>> +			 */
>>>> +			BUG_ON(!ocfs2_filecheck_erase_entry(ent));
>>>> +		}
>>>> +
>>>> +		entry = kmalloc(sizeof(struct ocfs2_filecheck_entry), GFP_NOFS);
>>>> +		if (entry) {
>>>> +			entry->fe_ino = args.fa_ino;
>>>> +			entry->fe_type = args.fa_type;
>>>> +			entry->fe_done = 0;
>>>> +			entry->fe_status = OCFS2_FILECHECK_ERR_INPROGRESS;
>>>> +			list_add_tail(&entry->fe_list,
>>>> +					&ent->fs_fcheck->fc_head);
>>>> +
>>>> +			ent->fs_fcheck->fc_size++;
>>>> +			ret = count;
>>>> +		} else {
>>>> +			ret = -ENOMEM;
>>>> +		}
>>>> +	}
>>>> +	spin_unlock(&ent->fs_fcheck->fc_lock);
>>>> +
>>>> +	if (entry)
>>>> +		ocfs2_filecheck_handle_entry(ent, entry);
>>>> +
>>>> +	ocfs2_filecheck_sysfs_put(ent);
>>>> +	return ret;
>>>> +}
>>>> diff --git a/fs/ocfs2/filecheck.h b/fs/ocfs2/filecheck.h
>>>> new file mode 100644
>>>> index 0000000..5ec331b
>>>> --- /dev/null
>>>> +++ b/fs/ocfs2/filecheck.h
>>>> @@ -0,0 +1,48 @@
>>>> +/* -*- mode: c; c-basic-offset: 8; -*-
>>>> + * vim: noexpandtab sw=8 ts=8 sts=0:
>>>> + *
>>>> + * filecheck.h
>>>> + *
>>>> + * Online file check.
>>>> + *
>>>> + * Copyright (C) 2015 Novell.  All rights reserved.
>>>> + *
>>>> + * This program is free software; you can redistribute it and/or
>>>> + * modify it under the terms of the GNU General Public
>>>> + * License as published by the Free Software Foundation, version 2.
>>>> + *
>>>> + * This program is distributed in the hope that it will be useful,
>>>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>>>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>>> + * General Public License for more details.
>>>> + */
>>>> +
>>>> +
>>>> +#ifndef FILECHECK_H
>>>> +#define FILECHECK_H
>>>> +
>>>> +#include <linux/types.h>
>>>> +#include <linux/list.h>
>>>> +
>>>> +
>>>> +/* File check errno */
>>>> +enum {
>>>> +	OCFS2_FILECHECK_ERR_SUCCESS = 0,	/* Success */
>>>> +	OCFS2_FILECHECK_ERR_FAILED = 1000,	/* Other failure */
>>>> +	OCFS2_FILECHECK_ERR_INPROGRESS,		/* In progress */
>>>> +	OCFS2_FILECHECK_ERR_READONLY,		/* Read only */
>>>> +	OCFS2_FILECHECK_ERR_INVALIDINO,		/* Invalid ino */
>>>> +	OCFS2_FILECHECK_ERR_BLOCKECC,		/* Block ecc */
>>>> +	OCFS2_FILECHECK_ERR_BLOCKNO,		/* Block number */
>>>> +	OCFS2_FILECHECK_ERR_VALIDFLAG,		/* Inode valid flag */
>>>> +	OCFS2_FILECHECK_ERR_GENERATION,		/* Inode generation */
>>>> +	OCFS2_FILECHECK_ERR_UNSUPPORTED		/* Unsupported */
>>>> +};
>>>> +
>>>> +#define OCFS2_FILECHECK_ERR_START	OCFS2_FILECHECK_ERR_FAILED
>>>> +#define OCFS2_FILECHECK_ERR_END		OCFS2_FILECHECK_ERR_UNSUPPORTED
>>>> +
>>>> +int ocfs2_filecheck_create_sysfs(struct super_block *sb);
>>>> +int ocfs2_filecheck_remove_sysfs(struct super_block *sb);
>>>> +
>>>> +#endif  /* FILECHECK_H */
>>>> diff --git a/fs/ocfs2/inode.h b/fs/ocfs2/inode.h
>>>> index 5e86b24..abd1018 100644
>>>> --- a/fs/ocfs2/inode.h
>>>> +++ b/fs/ocfs2/inode.h
>>>> @@ -139,6 +139,9 @@ int ocfs2_drop_inode(struct inode *inode);
>>>>  /* Flags for ocfs2_iget() */
>>>>  #define OCFS2_FI_FLAG_SYSFILE		0x1
>>>>  #define OCFS2_FI_FLAG_ORPHAN_RECOVERY	0x2
>>>> +#define OCFS2_FI_FLAG_FILECHECK_CHK	0x4
>>>> +#define OCFS2_FI_FLAG_FILECHECK_FIX	0x8
>>>> +
>>>>  struct inode *ocfs2_ilookup(struct super_block *sb, u64 feoff);
>>>>  struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 feoff, unsigned 
>>> flags,
>>>>  			 int sysfile_type);
>>>>
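
A note on the length check in ocfs2_filecheck_show() quoted above: snprintf()
returns the number of characters it would have written, so a truncated row is
signalled by a return value greater than or equal to the remaining space, not
only by one equal to it. Below is a minimal user-space sketch of that check.
The helper append_row() and its parameters are hypothetical and are not part
of the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the patch): append one formatted row to
 * buf at offset *total, where size is the full capacity of buf.
 * Returns 0 on success, -1 if the row did not fit or formatting failed.
 */
static int append_row(char *buf, size_t size, size_t *total,
		      unsigned long ino, unsigned int type, unsigned int done)
{
	size_t remain = size - *total;
	int ret;

	if (remain == 0)
		return -1;

	ret = snprintf(buf + *total, remain, "%lu\t\t%u\t%u\n",
		       ino, type, done);
	if (ret < 0)
		return -1;

	/* snprintf() reports the length it wanted, excluding the '\0' */
	if ((size_t)ret >= remain) {
		buf[*total] = '\0';	/* drop the partial row */
		return -1;
	}

	*total += (size_t)ret;
	return 0;
}
```

With a 16-byte buffer, a 7-character row fits and a 15-character row is
rejected without leaving a partial row behind.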

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  8:29         ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-03  8:47           ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  8:47 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel




>>> 
> On 11/03/2015 04:15 PM, Gang He wrote:
>> Hello Junxiao,
>> 
>> See my comments inline.
>> 
>> 
>>>>>
>>> Hi Gang,
>>>
>>> This is not like a right patch.
>>> First, online file check only checks inode's block number, valid flag,
>>> fs generation value, and meta ecc. I never see a real corruption
>>> happened only on this field, if these fields are corrupted, that means
>>> something bad may happen on other place. So fix this field may not help
>>> and even cause corruption more hard.
>> This online file check/fix feature is used to check/fix some light file meta 
> block corruption, instead of turning a file system off and using fsck.ocfs2.
> What's light meta block corruption? Do you have a case about it?
>> e.g. meta ecc error, we really need not to use fsck.ocfs2. 
>> of course, this feature does not replace fsck.ocfs2 and touch some 
> complicated meta block problems, if there is some potential problem in some 
> areas, we can discuss them one by one.
>> 
>> 
>> 
>>> Second, the repair way is wrong. In
>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>> match the ones in memory, the ones in memory are used to update the disk
>>> fields. The question is how do you know these field in memory are
>>> right(they may be the real corrupted ones)?
>> Here, if the inode block was corrupted, the file system is not able to load 
> it into the memory.
> How do you know inode block corrupted? If bh for inode block is
> overwritten, i mean bh corrupted, the repair will corrupted a good inode
> block.
As you know, a metadata block is validated only when the file system loads it from disk into memory.
If the inode object is already in memory, we consider the inode block to be OK.
If the file system cannot load the inode in the normal way, it prints a kernel error log that says which inode number is corrupted.
We then use the ocfs2_filecheck_repair_inode_block() function to fix the inode block before loading it.

Thanks
Gang

> 
> Thanks,
> Junxiao.
> 
>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory, 
> since it try to fix these light-level problem before loading.
>> if the fix is OK, the changed meta-block can pass the block-validate function 
> and load into the memory as a inode object.
>> Since the file system is under a cluster environment, we have to use some 
> existing function and code path to keep these block operation under a cluster 
> lock.
>> 
>> 
>> Thanks
>> Gang
>> 
>>>
>>> Thanks,
>>> Junxiao.
>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>> +			       struct buffer_head *bh)
>>>> +{
>>>> +	int rc;
>>>> +	int changed = 0;
>>>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>> +
>>>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>> +	/* Can't fix invalid inode block */
>>>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>> +		return rc;
>>>> +
>>>> +	trace_ocfs2_filecheck_repair_inode_block(
>>>> +		(unsigned long long)bh->b_blocknr);
>>>> +
>>>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>> +		mlog(ML_ERROR,
>>>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>> +			(unsigned long long)bh->b_blocknr);
>>>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>>>> +	}
>>>> +
>>>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>> +		changed = 1;
>>>> +		mlog(ML_ERROR,
>>>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>> +			(unsigned long long)bh->b_blocknr,
>>>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>>>> +	}
>>>> +
>>>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>> +		changed = 1;
>>>> +		mlog(ML_ERROR,
>>>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>> +			(unsigned long long)bh->b_blocknr);
>>>> +	}
>>>> +
>>>> +	if (le32_to_cpu(di->i_fs_generation) !=
>>>> +	    OCFS2_SB(sb)->fs_generation) {
>>>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>> +		changed = 1;
>>>> +		mlog(ML_ERROR,
>>>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>> +			(unsigned long long)bh->b_blocknr,
>>>> +			le32_to_cpu(di->i_fs_generation));
>>>> +	}
>>>> +
>>>> +	if (changed ||
>>>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>> +		mark_buffer_dirty(bh);
>>>> +		mlog(ML_ERROR,
>>>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>> +			(unsigned long long)bh->b_blocknr);
>>>> +	}
>>>> +
>>>> +	return 0;
>>>> +}
>> 
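
For reference, the field resets in the quoted
ocfs2_filecheck_repair_inode_block() can be modelled in user space roughly as
follows. This is only a sketch under stated assumptions: struct model_dinode,
MODEL_VALID_FL and model_repair() are hypothetical stand-ins, and endianness
conversion, the meta ecc recomputation and the read-only check are left out.

```c
#include <stdint.h>

#define MODEL_VALID_FL 0x1u	/* stand-in for OCFS2_VALID_FL */

/* Hypothetical, simplified view of the on-disk inode fields involved */
struct model_dinode {
	uint64_t blkno;		/* stand-in for i_blkno */
	uint32_t flags;		/* stand-in for i_flags */
	uint32_t generation;	/* stand-in for i_fs_generation */
};

/*
 * Reset each field that disagrees with the value supplied by the caller.
 * read_blkno models bh->b_blocknr and sb_generation models the superblock's
 * fs_generation. Returns the number of fields rewritten.
 */
static int model_repair(struct model_dinode *di, uint64_t read_blkno,
			uint32_t sb_generation)
{
	int changed = 0;

	if (di->blkno != read_blkno) {
		di->blkno = read_blkno;
		changed++;
	}
	if (!(di->flags & MODEL_VALID_FL)) {
		di->flags |= MODEL_VALID_FL;
		changed++;
	}
	if (di->generation != sb_generation) {
		di->generation = sb_generation;
		changed++;
	}
	return changed;
}
```

The model makes the point under discussion visible: the function rewrites the
block to match whatever values the caller passes in, and has no way of telling
which side of a mismatch is the wrong one.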

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  8:47           ` [Ocfs2-devel] " Gang He
@ 2015-11-03  9:01             ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-03  9:01 UTC (permalink / raw)
  To: Gang He, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel

On 11/03/2015 04:47 PM, Gang He wrote:
> 
> 
> 
>>>>
>> On 11/03/2015 04:15 PM, Gang He wrote:
>>> Hello Junxiao,
>>>
>>> See my comments inline.
>>>
>>>
>>>>>>
>>>> Hi Gang,
>>>>
>>>> This is not like a right patch.
>>>> First, online file check only checks inode's block number, valid flag,
>>>> fs generation value, and meta ecc. I never see a real corruption
>>>> happened only on this field, if these fields are corrupted, that means
>>>> something bad may happen on other place. So fix this field may not help
>>>> and even cause corruption more hard.
>>> This online file check/fix feature is used to check/fix some light file meta 
>> block corruption, instead of turning a file system off and using fsck.ocfs2.
>> What's light meta block corruption? Do you have a case about it?
>>> e.g. meta ecc error, we really need not to use fsck.ocfs2. 
>>> of course, this feature does not replace fsck.ocfs2 and touch some 
>> complicated meta block problems, if there is some potential problem in some 
>> areas, we can discuss them one by one.
>>>
>>>
>>>
>>>> Second, the repair way is wrong. In
>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>> match the ones in memory, the ones in memory are used to update the disk
>>>> fields. The question is how do you know these field in memory are
>>>> right(they may be the real corrupted ones)?
>>> Here, if the inode block was corrupted, the file system is not able to load 
>> it into the memory.
>> How do you know inode block corrupted? If bh for inode block is
>> overwritten, i mean bh corrupted, the repair will corrupted a good inode
>> block.
> You know, the meta block is only validated when the file system loads the block from disk to memory.
> If the inode object is in the memory, we consider this inode block is OK.
This assumption does not hold, because there are always bugs. A bug can
corrupt the inode object in memory, and the repair would then corrupt the fs.

Thanks,
Junxiao.
> If the inode is not loaded by the file system via the normal way, the file system will print a kernel error log to tell which ino is corrupted.
> we will use  ocfs2_filecheck_repair_inode_block() function to fix the inode block before loading.
> 
> Thanks
> Gang
> 
>>
>> Thanks,
>> Junxiao.
>>
>>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory, 
>> since it try to fix these light-level problem before loading.
>>> if the fix is OK, the changed meta-block can pass the block-validate function 
>> and load into the memory as a inode object.
>>> Since the file system is under a cluster environment, we have to use some 
>> existing function and code path to keep these block operation under a cluster 
>> lock.
>>>
>>>
>>> Thanks
>>> Gang
>>>
>>>>
>>>> Thanks,
>>>> Junxiao.
>>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>>> +			       struct buffer_head *bh)
>>>>> +{
>>>>> +	int rc;
>>>>> +	int changed = 0;
>>>>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>>> +
>>>>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>>> +	/* Can't fix invalid inode block */
>>>>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>>> +		return rc;
>>>>> +
>>>>> +	trace_ocfs2_filecheck_repair_inode_block(
>>>>> +		(unsigned long long)bh->b_blocknr);
>>>>> +
>>>>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>>> +		mlog(ML_ERROR,
>>>>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>>>>> +	}
>>>>> +
>>>>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>>> +		changed = 1;
>>>>> +		mlog(ML_ERROR,
>>>>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>>> +			(unsigned long long)bh->b_blocknr,
>>>>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>>>>> +	}
>>>>> +
>>>>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>>> +		changed = 1;
>>>>> +		mlog(ML_ERROR,
>>>>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>> +	}
>>>>> +
>>>>> +	if (le32_to_cpu(di->i_fs_generation) !=
>>>>> +	    OCFS2_SB(sb)->fs_generation) {
>>>>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>>> +		changed = 1;
>>>>> +		mlog(ML_ERROR,
>>>>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>>> +			(unsigned long long)bh->b_blocknr,
>>>>> +			le32_to_cpu(di->i_fs_generation));
>>>>> +	}
>>>>> +
>>>>> +	if (changed ||
>>>>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>>> +		mark_buffer_dirty(bh);
>>>>> +		mlog(ML_ERROR,
>>>>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>> +	}
>>>>> +
>>>>> +	return 0;
>>>>> +}
>>>


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  9:01             ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-03  9:25               ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-03  9:25 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh, rgoldwyn; +Cc: akpm, ocfs2-devel, linux-kernel




>>> 
> On 11/03/2015 04:47 PM, Gang He wrote:
>> 
>> 
>> 
>>>>>
>>> On 11/03/2015 04:15 PM, Gang He wrote:
>>>> Hello Junxiao,
>>>>
>>>> See my comments inline.
>>>>
>>>>
>>>>>>>
>>>>> Hi Gang,
>>>>>
>>>>> This is not like a right patch.
>>>>> First, online file check only checks inode's block number, valid flag,
>>>>> fs generation value, and meta ecc. I never see a real corruption
>>>>> happened only on this field, if these fields are corrupted, that means
>>>>> something bad may happen on other place. So fix this field may not help
>>>>> and even cause corruption more hard.
>>>> This online file check/fix feature is used to check/fix some light file meta 
> 
>>> block corruption, instead of turning a file system off and using fsck.ocfs2.
>>> What's light meta block corruption? Do you have a case about it?
>>>> e.g. meta ecc error, we really need not to use fsck.ocfs2. 
>>>> of course, this feature does not replace fsck.ocfs2 and touch some 
>>> complicated meta block problems, if there is some potential problem in some 
>>> areas, we can discuss them one by one.
>>>>
>>>>
>>>>
>>>>> Second, the repair way is wrong. In
>>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>>> match the ones in memory, the ones in memory are used to update the disk
>>>>> fields. The question is how do you know these field in memory are
>>>>> right(they may be the real corrupted ones)?
>>>> Here, if the inode block was corrupted, the file system is not able to load 
>>> it into the memory.
>>> How do you know inode block corrupted? If bh for inode block is
>>> overwritten, i mean bh corrupted, the repair will corrupted a good inode
>>> block.
>> You know, the meta block is only validated when the file system loads the 
> block from disk to memory.
>> If the inode object is in the memory, we consider this inode block is OK.
> This assuming is not true as there are always bugs. Bugs can make inode
> object in memory bad and corrupted the fs when repair the inode.
Right, in some cases the inode object in memory may already have been corrupted.
But in those cases the online file check/fix feature treats the inode object as already validated, makes no further modification, and exits.
If a later corruption occurs because this inode is flushed to disk, that damage is not caused by the online file check/fix code.
Later (perhaps at the next mount), the file system will fail to reload the inode block into memory and will print a kernel error log;
at that moment the online file check/fix can probably help to fix it.
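
To make the flow above concrete, here is a minimal user-space sketch in C of the
decision order in ocfs2_filecheck_repair_inode_block() as quoted in this thread:
validate first, leave a block that already validates untouched, refuse a block
that is not an inode at all, and only otherwise reset the three fields.
Everything prefixed model_ or MODEL_ is hypothetical and exists only for
illustration; the sketch ignores endianness conversion, the meta ecc
recomputation, the read-only check, and cluster locking.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the on-disk inode block. */
struct model_dinode {
	uint32_t signature;
	uint64_t i_blkno;
	uint32_t i_flags;
	uint32_t i_fs_generation;
};

#define MODEL_SIGNATURE 0x494e4f44u	/* arbitrary marker for the sketch */
#define MODEL_VALID_FL  0x1u

enum model_rc {
	MODEL_OK = 0,
	MODEL_ERR_INVALIDINO,	/* block does not look like an inode */
	MODEL_ERR_FIXABLE,	/* one of the three light fields is off */
};

/* Check the signature, then the three fields the repair path may reset. */
int model_validate(const struct model_dinode *di, uint64_t blocknr,
		   uint32_t fs_generation)
{
	if (di->signature != MODEL_SIGNATURE)
		return MODEL_ERR_INVALIDINO;
	if (di->i_blkno != blocknr ||
	    !(di->i_flags & MODEL_VALID_FL) ||
	    di->i_fs_generation != fs_generation)
		return MODEL_ERR_FIXABLE;
	return MODEL_OK;
}

/*
 * Returns 0 if nothing had to be done, -1 if the block cannot be fixed,
 * otherwise the number of fields that were reset.
 */
int model_repair(struct model_dinode *di, uint64_t blocknr,
		 uint32_t fs_generation)
{
	int changed = 0;
	int rc = model_validate(di, blocknr, fs_generation);

	if (rc == MODEL_OK)
		return 0;	/* already valid: no modification, exit */
	if (rc == MODEL_ERR_INVALIDINO)
		return -1;	/* not an inode block: do not touch it */

	if (di->i_blkno != blocknr) {
		di->i_blkno = blocknr;
		changed++;
	}
	if (!(di->i_flags & MODEL_VALID_FL)) {
		di->i_flags |= MODEL_VALID_FL;
		changed++;
	}
	if (di->i_fs_generation != fs_generation) {
		di->i_fs_generation = fs_generation;
		changed++;
	}
	return changed;
}
```

Note that the sketch, like the quoted patch, takes the block number and the
fs generation from the caller as the reference values, which is exactly the
trust assumption being questioned in this thread.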

Thanks
Gang  
  
> 
> Thanks,
> Junxiao.
>> If the inode is not loaded by the file system via the normal way, the file 
> system will print a kernel error log to tell which ino is corrupted.
>> we will use  ocfs2_filecheck_repair_inode_block() function to fix the inode 
> block before loading.
>> 
>> Thanks
>> Gang
>> 
>>>
>>> Thanks,
>>> Junxiao.
>>>
>>>> ocfs2_filecheck_repair_inode_block() will able to load it into the memory, 
>>> since it try to fix these light-level problem before loading.
>>>> if the fix is OK, the changed meta-block can pass the block-validate function 
>>> and load into the memory as a inode object.
>>>> Since the file system is under a cluster environment, we have to use some 
>>> existing function and code path to keep these block operation under a 
> cluster 
>>> lock.
>>>>
>>>>
>>>> Thanks
>>>> Gang
>>>>
>>>>>
>>>>> Thanks,
>>>>> Junxiao.
>>>>> On 10/28/2015 02:26 PM, Gang He wrote:
>>>>>> +static int ocfs2_filecheck_repair_inode_block(struct super_block *sb,
>>>>>> +			       struct buffer_head *bh)
>>>>>> +{
>>>>>> +	int rc;
>>>>>> +	int changed = 0;
>>>>>> +	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>>>> +
>>>>>> +	rc = ocfs2_filecheck_validate_inode_block(sb, bh);
>>>>>> +	/* Can't fix invalid inode block */
>>>>>> +	if (!rc || rc == -OCFS2_FILECHECK_ERR_INVALIDINO)
>>>>>> +		return rc;
>>>>>> +
>>>>>> +	trace_ocfs2_filecheck_repair_inode_block(
>>>>>> +		(unsigned long long)bh->b_blocknr);
>>>>>> +
>>>>>> +	if (ocfs2_is_hard_readonly(OCFS2_SB(sb)) ||
>>>>>> +		ocfs2_is_soft_readonly(OCFS2_SB(sb))) {
>>>>>> +		mlog(ML_ERROR,
>>>>>> +			"Filecheck: try to repair dinode #%llu on readonly filesystem\n",
>>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>>> +		return -OCFS2_FILECHECK_ERR_READONLY;
>>>>>> +	}
>>>>>> +
>>>>>> +	if (le64_to_cpu(di->i_blkno) != bh->b_blocknr) {
>>>>>> +		di->i_blkno = cpu_to_le64(bh->b_blocknr);
>>>>>> +		changed = 1;
>>>>>> +		mlog(ML_ERROR,
>>>>>> +			"Filecheck: reset dinode #%llu: i_blkno to %llu\n",
>>>>>> +			(unsigned long long)bh->b_blocknr,
>>>>>> +			(unsigned long long)le64_to_cpu(di->i_blkno));
>>>>>> +	}
>>>>>> +
>>>>>> +	if (!(di->i_flags & cpu_to_le32(OCFS2_VALID_FL))) {
>>>>>> +		di->i_flags |= cpu_to_le32(OCFS2_VALID_FL);
>>>>>> +		changed = 1;
>>>>>> +		mlog(ML_ERROR,
>>>>>> +			"Filecheck: reset dinode #%llu: OCFS2_VALID_FL is set\n",
>>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>>> +	}
>>>>>> +
>>>>>> +	if (le32_to_cpu(di->i_fs_generation) !=
>>>>>> +	    OCFS2_SB(sb)->fs_generation) {
>>>>>> +		di->i_fs_generation = cpu_to_le32(OCFS2_SB(sb)->fs_generation);
>>>>>> +		changed = 1;
>>>>>> +		mlog(ML_ERROR,
>>>>>> +			"Filecheck: reset dinode #%llu: fs_generation to %u\n",
>>>>>> +			(unsigned long long)bh->b_blocknr,
>>>>>> +			le32_to_cpu(di->i_fs_generation));
>>>>>> +	}
>>>>>> +
>>>>>> +	if (changed ||
>>>>>> +		ocfs2_validate_meta_ecc(sb, bh->b_data, &di->i_check)) {
>>>>>> +		ocfs2_compute_meta_ecc(sb, bh->b_data, &di->i_check);
>>>>>> +		mark_buffer_dirty(bh);
>>>>>> +		mlog(ML_ERROR,
>>>>>> +			"Filecheck: reset dinode #%llu: compute meta ecc\n",
>>>>>> +			(unsigned long long)bh->b_blocknr);
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-03  8:20         ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-24 21:46           ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-11-24 21:46 UTC (permalink / raw)
  To: Junxiao Bi; +Cc: Gang He, rgoldwyn, akpm, ocfs2-devel, linux-kernel

On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
> Hi Gang,
> 
> On 11/03/2015 03:54 PM, Gang He wrote:
> > Hi Junxiao,
> > 
> > Thank for your reviewing.
> > Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
> > But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
> > Why?
> > 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
> If user don't want this, they should not use error=continue option, let
> fs go after a corruption is very dangerous.

Maybe we need another errors=XXX flag (maybe errors=fix)?

You both make good points; here's what I gather from the conversation:

 - Some customers would be sad if they have to manually fix corruptions.
   This takes effort on their part, and if the FS can handle it
   automatically, it should.

 - There are valid concerns that automatically fixing things is a change in
   behavior that might not be welcome, or worse might lead to unforeseeable
   circumstances.

 - I will add that fixing things automatically implies checking them
   automatically which could introduce some performance impact depending on
   how much checking we're doing.

So if the user wants errors to be fixed automatically, they could mount with
errors=fix, and everyone else would have no change in behavior unless they
wanted to make use of the new feature.
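
As a rough illustration of the proposal above, the following user-space C
sketch parses an errors= mount option and decides whether automatic fixing is
enabled. errors=fix is only the value proposed in this mail, not an existing
option; the enum, the function names, and the list of other accepted values
(remount-ro, panic, continue) are assumptions made for the sketch and are not
taken from the ocfs2 option-parsing code.

```c
#include <string.h>

/* Hypothetical error-handling modes; ERRORS_FIX is the proposed addition. */
enum errors_mode {
	ERRORS_INVALID = -1,
	ERRORS_REMOUNT_RO,
	ERRORS_PANIC,
	ERRORS_CONTINUE,
	ERRORS_FIX,
};

/* Parse a single "errors=<value>" token; anything else is rejected. */
enum errors_mode parse_errors_opt(const char *opt)
{
	static const char prefix[] = "errors=";
	const char *val;

	if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0)
		return ERRORS_INVALID;
	val = opt + sizeof(prefix) - 1;

	if (strcmp(val, "remount-ro") == 0)
		return ERRORS_REMOUNT_RO;
	if (strcmp(val, "panic") == 0)
		return ERRORS_PANIC;
	if (strcmp(val, "continue") == 0)
		return ERRORS_CONTINUE;
	if (strcmp(val, "fix") == 0)
		return ERRORS_FIX;
	return ERRORS_INVALID;
}

/* Only the opt-in mode enables automatic check/fix; all others keep
 * their current behavior. */
int should_auto_fix(enum errors_mode mode)
{
	return mode == ERRORS_FIX;
}
```

With this shape, users who do not pass the new value see no change in
behavior, which is the property asked for above.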


> > 2) frankly speaking, this feature will probably bring a second corruption if there is some error in the code, I do not suggest to use automatically fix by default in the first version.
> I think if this feature could bring more corruption, then this should be
> fixed first.

Btw, I am pretty sure that Gang is referring to the feature being new and
thus more likely to have problems. I see nothing in here that would
corrupt the file system.
	--Mark


--
Mark Fasheh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 1/4] ocfs2: export ocfs2_kset for online file check
  2015-10-28  6:25   ` [Ocfs2-devel] " Gang He
@ 2015-11-24 21:47     ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-11-24 21:47 UTC (permalink / raw)
  To: Gang He; +Cc: rgoldwyn, linux-kernel, ocfs2-devel, akpm

On Wed, Oct 28, 2015 at 02:25:58PM +0800, Gang He wrote:
> Export ocfs2_kset object from ocfs2_stackglue kernel module,
> then online file check code will create the related sysfiles
> under ocfs2_kset object.
> 
> Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>

--
Mark Fasheh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-10-28  6:25   ` [Ocfs2-devel] " Gang He
@ 2015-11-24 21:52     ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-11-24 21:52 UTC (permalink / raw)
  To: Gang He; +Cc: rgoldwyn, linux-kernel, ocfs2-devel

On Wed, Oct 28, 2015 at 02:25:59PM +0800, Gang He wrote:
> Implement online file check sysfile interfaces, e.g.
> how to create the related sysfile according to device name,
> how to display/handle file check request from the sysfile.
> 
> Signed-off-by: Gang He <ghe@suse.com>

FYI, this looks generally fine to me; however, we should address Junxiao's
concerns before it goes any further.
	--Mark

--
Mark Fasheh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 3/4] ocfs2: create/remove sysfile for online file check
  2015-10-28  6:26   ` [Ocfs2-devel] " Gang He
@ 2015-11-24 21:53     ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-11-24 21:53 UTC (permalink / raw)
  To: Gang He; +Cc: rgoldwyn, linux-kernel, ocfs2-devel, akpm

On Wed, Oct 28, 2015 at 02:26:00PM +0800, Gang He wrote:
> Create online file check sysfile when ocfs2 mount,
> remove the related sysfile when ocfs2 umount.
> 
> Signed-off-by: Gang He <ghe@suse.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
	--Mark

--
Mark Fasheh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-24 21:46           ` [Ocfs2-devel] " Mark Fasheh
@ 2015-11-24 21:55             ` Srinivas Eeda
  -1 siblings, 0 replies; 80+ messages in thread
From: Srinivas Eeda @ 2015-11-24 21:55 UTC (permalink / raw)
  To: Mark Fasheh, Junxiao Bi; +Cc: linux-kernel, ocfs2-devel

On 11/24/2015 01:46 PM, Mark Fasheh wrote:
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> On 11/03/2015 03:54 PM, Gang He wrote:
>>> Hi Junxiao,
>>>
>>> Thank for your reviewing.
>>> Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
>>> But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
>>> Why?
>>> 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
>> If user don't want this, they should not use error=continue option, let
>> fs go after a corruption is very dangerous.
> Maybe we need another errors=XXX flag (maybe errors=fix)?
Great idea, Mark! I think adding errors=fix would be a good way to
address both concerns :) It gives some control to anyone who is
uncomfortable with things getting checked/fixed automatically.

>
> You both make good points, here's what I gather from the conversation:
>
>   - Some customers would be sad if they have to manually fix corruptions.
>     This takes effort on their part, and if the FS can handle it
>     automatically, it should.
>
>   - There are valid concerns that automatically fixing things is a change in
>     behavior that might not be welcome, or worse might lead to unforseeable
>     circumstances.
>
>   - I will add that fixing things automatically implies checking them
>     automatically which could introduce some performance impact depending on
>     how much checking we're doing.
>
> So if the user wants errors to be fixed automatically, they could mount with
> errros=fix, and everyone else would have no change in behavior unless they
> wanted to make use of the new feature.
>
>
>>> 2) frankly speaking, this feature will probably bring a second corruption if there is some error in the code, I do not suggest to use automatically fix by default in the first version.
>> I think if this feature could bring more corruption, then this should be
>> fixed first.
> Btw, I am pretty sure that Gang is referring to the feature being new and
> thus more likely to have problems. There is nothing I see in here that is
> file system corrupting.
> 	--Mark
>
>
> --
> Mark Fasheh


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-03  7:12     ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-24 22:16       ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-11-24 22:16 UTC (permalink / raw)
  To: Junxiao Bi; +Cc: Gang He, rgoldwyn, linux-kernel, ocfs2-devel

Hi Junxiao,

On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
> Hi Gang,
> 
> This is not like a right patch.
> First, online file check only checks inode's block number, valid flag,
> fs generation value, and meta ecc. I never see a real corruption
> happened only on this field, if these fields are corrupted, that means
> something bad may happen on other place. So fix this field may not help
> and even cause corruption more hard.

I agree that these are rather uncommon; we might even consider removing the
VALID_FL fixup. That said, I definitely don't think we're ready for anything
more complicated than this. We have to start somewhere.


> Second, the repair way is wrong. In
> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
> match the ones in memory, the ones in memory are used to update the disk
> fields. The question is how do you know these field in memory are
> right(they may be the real corrupted ones)?

Your second point (and the last part of your 1st point) makes a good
argument for why this shouldn't happen automatically. Some of these
corruptions might require a human to look at the log and decide what to do.
Especially as you point out, where we might not know where the source of the
corruption is. And if the human can't figure it out, then it's probably time
to unmount and fsck.
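
As an illustration of the "report, then let a human decide" approach
described above, here is a minimal userspace sketch in C. It assumes a
simplified, hypothetical struct inode_fields (not the real ocfs2 inode
layout) and a hypothetical helper filecheck_compare(). It only reports
which fields disagree and never picks a "correct" copy, since a mismatch
alone does not say which side is corrupted.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for the few inode fields under
 * discussion; this is NOT the real ocfs2 inode layout. */
struct inode_fields {
	uint64_t blkno;         /* inode block number */
	uint32_t fs_generation; /* filesystem generation value */
	uint32_t flags;         /* inode flags, e.g. a "valid" bit */
};

/* Bitmask values describing which fields disagree (assumed names). */
enum {
	MISMATCH_BLKNO      = 1u << 0,
	MISMATCH_GENERATION = 1u << 1,
	MISMATCH_FLAGS      = 1u << 2,
};

/*
 * Compare the on-disk copy with the in-memory copy and return a bitmask
 * of mismatching fields. Neither copy is modified: the caller is expected
 * to log the result and leave the repair decision to the administrator.
 */
unsigned int filecheck_compare(const struct inode_fields *disk,
			       const struct inode_fields *mem)
{
	unsigned int mismatch = 0;

	if (disk->blkno != mem->blkno)
		mismatch |= MISMATCH_BLKNO;
	if (disk->fs_generation != mem->fs_generation)
		mismatch |= MISMATCH_GENERATION;
	if (disk->flags != mem->flags)
		mismatch |= MISMATCH_FLAGS;

	return mismatch;
}
```

A return value of 0 means the two copies agree on these fields; any other
value would be logged so that someone can decide between a targeted fix
and an offline fsck.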

Thanks,
	--Mark

--
Mark Fasheh

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-24 21:46           ` [Ocfs2-devel] " Mark Fasheh
@ 2015-11-25  3:29             ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-25  3:29 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh; +Cc: akpm, ocfs2-devel, rgoldwyn, linux-kernel

Hi Mark and Junxiao,


>>> 
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>> 
>> On 11/03/2015 03:54 PM, Gang He wrote:
>> > Hi Junxiao,
>> > 
>> > Thank for your reviewing.
>> > Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
>> > But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
>> > Why?
>> > 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
>> If user don't want this, they should not use error=continue option, let
>> fs go after a corruption is very dangerous.
> 
> Maybe we need another errors=XXX flag (maybe errors=fix)?
> 
> You both make good points, here's what I gather from the conversation:
> 
>  - Some customers would be sad if they have to manually fix corruptions.
>    This takes effort on their part, and if the FS can handle it
>    automatically, it should.
> 
>  - There are valid concerns that automatically fixing things is a change in
>    behavior that might not be welcome, or worse might lead to unforseeable
>    circumstances.
> 
>  - I will add that fixing things automatically implies checking them
>    automatically which could introduce some performance impact depending on
>    how much checking we're doing.
> 
> So if the user wants errors to be fixed automatically, they could mount with
> errros=fix, and everyone else would have no change in behavior unless they
> wanted to make use of the new feature.
That is what I wanted to say: add a mount option and let users decide. However, I want to split the
"errors=fix" mount option out of the online file check feature; I think it should be an independent feature.
We can implement it after online file check is done. I want to split the work into smaller,
more detailed features and implement them one by one. Do you agree with this point?
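
To make the opt-in idea concrete, here is a minimal userspace sketch in C
of how an "errors=" option value might be mapped to a policy. The "fix"
value is only the proposal from this thread, not an existing ocfs2 option,
and the names parse_errors_opt(), auto_fix_enabled() and enum
errors_policy are hypothetical. The other value names are assumed to
mirror the existing options.

```c
#include <string.h>

/* Error-handling policies. ERRORS_FIX is the hypothetical new mode
 * proposed in this thread; it is not an existing ocfs2 option. */
enum errors_policy {
	ERRORS_UNKNOWN = -1,
	ERRORS_REMOUNT_RO,
	ERRORS_PANIC,
	ERRORS_CONTINUE,
	ERRORS_FIX,
};

/* Parse a single "errors=<value>" mount option token. */
enum errors_policy parse_errors_opt(const char *opt)
{
	static const char prefix[] = "errors=";
	const char *val;

	/* Anything that does not start with "errors=" is rejected. */
	if (strncmp(opt, prefix, sizeof(prefix) - 1) != 0)
		return ERRORS_UNKNOWN;
	val = opt + sizeof(prefix) - 1;

	if (strcmp(val, "remount-ro") == 0)
		return ERRORS_REMOUNT_RO;
	if (strcmp(val, "panic") == 0)
		return ERRORS_PANIC;
	if (strcmp(val, "continue") == 0)
		return ERRORS_CONTINUE;
	if (strcmp(val, "fix") == 0)
		return ERRORS_FIX;

	return ERRORS_UNKNOWN;
}

/* Only an explicit errors=fix opts in to automatic repair, so every
 * other policy keeps its current behaviour. */
int auto_fix_enabled(enum errors_policy policy)
{
	return policy == ERRORS_FIX;
}
```

With this shape, the manually triggered check can be developed first, and
the automatic path is reached only when the user has asked for it at
mount time.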

> 
> 
>> > 2) frankly speaking, this feature will probably bring a second corruption if there is some error in the code, I do not suggest to use automatically fix by default in the first version.
>> I think if this feature could bring more corruption, then this should be
>> fixed first.
> 
> Btw, I am pretty sure that Gang is referring to the feature being new and
> thus more likely to have problems. There is nothing I see in here that is
> file system corrupting.
> 	--Mark
> 
> 
> --
> Mark Fasheh


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-24 22:16       ` Mark Fasheh
@ 2015-11-25  4:11         ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-25  4:11 UTC (permalink / raw)
  To: Mark Fasheh; +Cc: Gang He, rgoldwyn, linux-kernel, ocfs2-devel

Hi Mark,

On 11/25/2015 06:16 AM, Mark Fasheh wrote:
> Hi Junxiao,
> 
> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> This is not like a right patch.
>> First, online file check only checks inode's block number, valid flag,
>> fs generation value, and meta ecc. I never see a real corruption
>> happened only on this field, if these fields are corrupted, that means
>> something bad may happen on other place. So fix this field may not help
>> and even cause corruption more hard.
> 
> I agree that these are rather uncommon, we might even consider removing the
> VALID_FL fixup. I definitely don't think we're ready for anything more
> complicated than this though either. We kind of have to start somewhere too.
> 
Yes, the fix is too simple and is just a start. I think we had better wait
until more useful parts are done before merging it.
> 
>> Second, the repair way is wrong. In
>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>> match the ones in memory, the ones in memory are used to update the disk
>> fields. The question is how do you know these field in memory are
>> right(they may be the real corrupted ones)?
> 
> Your second point (and the last part of your 1st point) makes a good
> argument for why this shouldn't happen automatically. Some of these
> corruptions might require a human to look at the log and decide what to do.
> Especially as you point out, where we might not know where the source of the
> corruption is. And if the human can't figure it out, then it's probably time
> to unmount and fsck.
The point is that the way of fixing is wrong: just flushing the in-memory
info to disk is not right. I agree online fsck is a good feature, but it
needs careful design and should not introduce more corruption. A rough idea
of mine is that maybe we need some "freeze" mechanism in the fs, which can
hang all fs operations and let the fs stop at a safe point. After freezing
the fs, we can do some fsck work on it, and this work should not take much
time. What's your idea?

Thanks,
Junxiao.

> 
> Thanks,
> 	--Mark
> 
> --
> Mark Fasheh
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-24 21:46           ` [Ocfs2-devel] " Mark Fasheh
@ 2015-11-25  4:33             ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-25  4:33 UTC (permalink / raw)
  To: Mark Fasheh; +Cc: Gang He, rgoldwyn, akpm, ocfs2-devel, linux-kernel

On 11/25/2015 05:46 AM, Mark Fasheh wrote:
> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>> Hi Gang,
>>
>> On 11/03/2015 03:54 PM, Gang He wrote:
>>> Hi Junxiao,
>>>
>>> Thank for your reviewing.
>>> Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
>>> But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
>>> Why?
>>> 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
>> If user don't want this, they should not use error=continue option, let
>> fs go after a corruption is very dangerous.
> 
> Maybe we need another errors=XXX flag (maybe errors=fix)?
Sounds great. This is a good option, since users may not have enough
knowledge to decide whether to fix the issue that was found.

Thanks,
Junxiao.
> 
> You both make good points, here's what I gather from the conversation:
> 
>  - Some customers would be sad if they have to manually fix corruptions.
>    This takes effort on their part, and if the FS can handle it
>    automatically, it should.
> 
>  - There are valid concerns that automatically fixing things is a change in
>    behavior that might not be welcome, or worse might lead to unforseeable
>    circumstances.
> 
>  - I will add that fixing things automatically implies checking them
>    automatically which could introduce some performance impact depending on
>    how much checking we're doing.
> 
> So if the user wants errors to be fixed automatically, they could mount with
> errros=fix, and everyone else would have no change in behavior unless they
> wanted to make use of the new feature.
> 
> 
>>> 2) frankly speaking, this feature will probably bring a second corruption if there is some error in the code, I do not suggest to use automatically fix by default in the first version.
>> I think if this feature could bring more corruption, then this should be
>> fixed first.
> 
> Btw, I am pretty sure that Gang is referring to the feature being new and
> thus more likely to have problems. There is nothing I see in here that is
> file system corrupting.
> 	--Mark
> 
> 
> --
> Mark Fasheh
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-25  3:29             ` [Ocfs2-devel] " Gang He
@ 2015-11-25  4:43               ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-25  4:43 UTC (permalink / raw)
  To: Gang He, Mark Fasheh; +Cc: akpm, ocfs2-devel, rgoldwyn, linux-kernel

Hi Gang,

On 11/25/2015 11:29 AM, Gang He wrote:
> Hi Mark and Junxiao,
> 
> 
>>>>
>> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> On 11/03/2015 03:54 PM, Gang He wrote:
>>>> Hi Junxiao,
>>>>
>>>> Thank for your reviewing.
>>>> Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
>>>> But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
>>>> Why?
>>>> 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
>>> If user don't want this, they should not use error=continue option, let
>>> fs go after a corruption is very dangerous.
>>
>> Maybe we need another errors=XXX flag (maybe errors=fix)?
>>
>> You both make good points, here's what I gather from the conversation:
>>
>>  - Some customers would be sad if they have to manually fix corruptions.
>>    This takes effort on their part, and if the FS can handle it
>>    automatically, it should.
>>
>>  - There are valid concerns that automatically fixing things is a change in
>>    behavior that might not be welcome, or worse might lead to unforseeable
>>    circumstances.
>>
>>  - I will add that fixing things automatically implies checking them
>>    automatically which could introduce some performance impact depending on
>>    how much checking we're doing.
>>
>> So if the user wants errors to be fixed automatically, they could mount with
>> errros=fix, and everyone else would have no change in behavior unless they
>> wanted to make use of the new feature.
> That is what I want to say, add a mount option to let users to decide. Here, I want to split "error=fix"
> mount option  task out from online file check feature, I think this part should be a independent feature.
> We can implement this feature after online file check is done, I want to split the feature into some more 
> detailed features, implement them one by one. Do you agree this point?
With errors=fix, when a possible corruption is found, online fsck will
start to check and fix things. So this doesn't look like an independent
feature.

Thanks,
Junxiao.

> 
>>
>>
>>>> 2) frankly speaking, this feature will probably bring a second corruption if there is some error in the code, I do not suggest to use automatically fix by default in the first version.
>>> I think if this feature could bring more corruption, then this should be
>>> fixed first.
>>
>> Btw, I am pretty sure that Gang is referring to the feature being new and
>> thus more likely to have problems. There is nothing I see in here that is
>> file system corrupting.
>> 	--Mark
>>
>>
>> --
>> Mark Fasheh
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-25  4:11         ` Junxiao Bi
@ 2015-11-25  5:04           ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-25  5:04 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh; +Cc: ocfs2-devel, rgoldwyn, linux-kernel

Hi Mark and Junxiao,


>>> 
> Hi Mark,
> 
> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>> Hi Junxiao,
>> 
>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>> Hi Gang,
>>>
>>> This is not like a right patch.
>>> First, online file check only checks inode's block number, valid flag,
>>> fs generation value, and meta ecc. I never see a real corruption
>>> happened only on this field, if these fields are corrupted, that means
>>> something bad may happen on other place. So fix this field may not help
>>> and even cause corruption more hard.
>> 
>> I agree that these are rather uncommon, we might even consider removing the
>> VALID_FL fixup. I definitely don't think we're ready for anything more
>> complicated than this though either. We kind of have to start somewhere too.
>> 
> Yes, the fix is too simple, and just a start, I think we'd better wait
> more useful parts done before merging it.
I agree, just setting the VALID_FL flag again to fix this field is too simple. We should delay fixing this
field until I have a sound solution; I will remove these lines of code from the first version of the patches.
In future submissions, I also hope you guys will help review the code carefully and speak up whenever you have doubts.



>> 
>>> Second, the repair way is wrong. In
>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>> match the ones in memory, the ones in memory are used to update the disk
>>> fields. The question is how do you know these field in memory are
>>> right(they may be the real corrupted ones)?
>> 
>> Your second point (and the last part of your 1st point) makes a good
>> argument for why this shouldn't happen automatically. Some of these
>> corruptions might require a human to look at the log and decide what to do.
>> Especially as you point out, where we might not know where the source of the
>> corruption is. And if the human can't figure it out, then it's probably time
>> to unmount and fsck.
> The point is that the fix way is wrong, just flush memory info to disk
> is not right. I agree online fsck is good feature, but need carefully
> design, it should not involve more corruptions. A rough idea from mine
> is that maybe we need some "frezee" mechanism in fs, which can hung all
> fs op and let fs stop at a safe area. After freeze fs, we can do some
> fsck work on it and these works should not cost lots time. What's your idea?
If we need to touch global data structures, freezing the fs can be considered when there is
no way to handle the case with locks.
If we are only handling an independent problem, we just need to lock the related data structures.

> 
> Thanks,
> Junxiao.
> 
>> 
>> Thanks,
>> 	--Mark
>> 
>> --
>> Mark Fasheh
>> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-25  4:43               ` [Ocfs2-devel] " Junxiao Bi
@ 2015-11-25  5:11                 ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-11-25  5:11 UTC (permalink / raw)
  To: Junxiao Bi, Mark Fasheh; +Cc: akpm, ocfs2-devel, rgoldwyn, linux-kernel

Hi Junxiao,


>>> 
> Hi Gang,
> 
> On 11/25/2015 11:29 AM, Gang He wrote:
>> Hi Mark and Junxiao,
>> 
>> 
>>>>>
>>> On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
>>>> Hi Gang,
>>>>
>>>> On 11/03/2015 03:54 PM, Gang He wrote:
>>>>> Hi Junxiao,
>>>>>
>>>>> Thank for your reviewing.
>>>>> Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
>>>>> But, this operation is manually triggered by user, instead of automatically  fix in the kernel.
>>>>> Why?
>>>>> 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
>>>> If user don't want this, they should not use error=continue option, let
>>>> fs go after a corruption is very dangerous.
>>>
>>> Maybe we need another errors=XXX flag (maybe errors=fix)?
>>>
>>> You both make good points, here's what I gather from the conversation:
>>>
>>>  - Some customers would be sad if they have to manually fix corruptions.
>>>    This takes effort on their part, and if the FS can handle it
>>>    automatically, it should.
>>>
>>>  - There are valid concerns that automatically fixing things is a change in
>>>    behavior that might not be welcome, or worse might lead to unforeseeable
>>>    circumstances.
>>>
>>>  - I will add that fixing things automatically implies checking them
>>>    automatically which could introduce some performance impact depending on
>>>    how much checking we're doing.
>>>
>>> So if the user wants errors to be fixed automatically, they could mount with
>>> errors=fix, and everyone else would have no change in behavior unless they
>>> wanted to make use of the new feature.
>> That is what I want to say, add a mount option to let users to decide. Here,
>> I want to split "error=fix" mount option  task out from online file check
>> feature, I think this part should be a independent feature.
>> We can implement this feature after online file check is done, I want to
>> split the feature into some more detailed features, implement them one by
>> one. Do you agree this point?
> With error=fix, when a possible corruption is found, online fsck will
> start to check and fix things. So this doesn't looks like a independent
> feature.
What I mean is, we can implement the manually triggered online file check first, then
add the "error=fix" mount option; the second feature can be implemented after
the first part is done. I want to split the work into smaller items, which should make it easier
to review. The overall feature idea is fine; we just need to do it one piece at a time.

> 
> Thanks,
> Junxiao.
> 
>> 
>>>
>>>
>>>>> 2) frankly speaking, this feature will probably bring a second corruption
>>>>> if there is some error in the code, I do not suggest to use automatically
>>>>> fix by default in the first version.
>>>> I think if this feature could bring more corruption, then this should be
>>>> fixed first.
>>>
>>> Btw, I am pretty sure that Gang is referring to the feature being new and
>>> thus more likely to have problems. There is nothing I see in here that is
>>> file system corrupting.
>>> 	--Mark
>>>
>>>
>>> --
>>> Mark Fasheh
>> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/4] ocfs2: check/fix inode block for online file check
  2015-11-25  5:04           ` Gang He
@ 2015-11-25  5:44             ` Junxiao Bi
  -1 siblings, 0 replies; 80+ messages in thread
From: Junxiao Bi @ 2015-11-25  5:44 UTC (permalink / raw)
  To: Gang He, Mark Fasheh; +Cc: ocfs2-devel, rgoldwyn, linux-kernel

On 11/25/2015 01:04 PM, Gang He wrote:
> Hi Mark and Junxiao,
> 
> 
>>>>
>> Hi Mark,
>>
>> On 11/25/2015 06:16 AM, Mark Fasheh wrote:
>>> Hi Junxiao,
>>>
>>> On Tue, Nov 03, 2015 at 03:12:35PM +0800, Junxiao Bi wrote:
>>>> Hi Gang,
>>>>
>>>> This is not like a right patch.
>>>> First, online file check only checks inode's block number, valid flag,
>>>> fs generation value, and meta ecc. I never see a real corruption
>>>> happened only on this field, if these fields are corrupted, that means
>>>> something bad may happen on other place. So fix this field may not help
>>>> and even cause corruption more hard.
>>>
>>> I agree that these are rather uncommon, we might even consider removing the
>>> VALID_FL fixup. I definitely don't think we're ready for anything more
>>> complicated than this though either. We kind of have to start somewhere too.
>>>
>> Yes, the fix is too simple, and just a start, I think we'd better wait
>> more useful parts done before merging it.
> I agree, just remark VALID_FL flag to fix this field is too simple, we should delay this field fix before 
> I have a flawless solution, I will remove these lines code in the first version patches. In the future submits,
> I also hope your guys to help review the code carefully, shout out your comments when you doubt somewhere.
Sure.

> 
> 
> 
>>>
>>>> Second, the repair way is wrong. In
>>>> ocfs2_filecheck_repair_inode_block(), if these fields in disk don't
>>>> match the ones in memory, the ones in memory are used to update the disk
>>>> fields. The question is how do you know these field in memory are
>>>> right(they may be the real corrupted ones)?
>>>
>>> Your second point (and the last part of your 1st point) makes a good
>>> argument for why this shouldn't happen automatically. Some of these
>>> corruptions might require a human to look at the log and decide what to do.
>>> Especially as you point out, where we might not know where the source of the
>>> corruption is. And if the human can't figure it out, then it's probably time
>>> to unmount and fsck.
>> The point is that the fix way is wrong, just flush memory info to disk
>> is not right. I agree online fsck is good feature, but need carefully
>> design, it should not involve more corruptions. A rough idea from mine
>> is that maybe we need some "frezee" mechanism in fs, which can hung all
>> fs op and let fs stop at a safe area. After freeze fs, we can do some
>> fsck work on it and these works should not cost lots time. What's your idea?
> If we need to touch some global data structures, freezing fs can be considered when we can't
> get any way in case using the locks.
> If we only handle some independent problem, we just need to lock the related data structures. 
Hmm, I am not sure how hard it is to decide whether an issue is independent.

Thanks,
Junxiao.
> 
>>
>> Thanks,
>> Junxiao.
>>
>>>
>>> Thanks,
>>> 	--Mark
>>>
>>> --
>>> Mark Fasheh
>>>
> 


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 0/4] Add online file check feature
  2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
@ 2015-12-02 18:20   ` Pavel Machek
  -1 siblings, 0 replies; 80+ messages in thread
From: Pavel Machek @ 2015-12-02 18:20 UTC (permalink / raw)
  To: Gang He; +Cc: mfasheh, rgoldwyn, linux-kernel, ocfs2-devel, akpm, Greg KH

On Wed 2015-10-28 14:25:57, Gang He wrote:
> When there are errors in the ocfs2 filesystem,
> they are usually accompanied by the inode number which caused the error.
> This inode number would be the input to fixing the file.
> One of these options could be considered:
> A file in the sys filesytem which would accept inode numbers.
> This could be used to communication back what has to be fixed or is fixed.
> You could write:
> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
> or
> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
> 

Are you sure this is a reasonable interface? I mean.... sysfs is
supposed to be one value per file. And I don't think it's suitable for
running commands.

...or returning back results.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-02 18:20   ` [Ocfs2-devel] " Pavel Machek
@ 2015-12-03  2:05     ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-12-03  2:05 UTC (permalink / raw)
  To: pavel; +Cc: greg, akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, linux-kernel

Hello Pavel,



>>> 
> On Wed 2015-10-28 14:25:57, Gang He wrote:
>> When there are errors in the ocfs2 filesystem,
>> they are usually accompanied by the inode number which caused the error.
>> This inode number would be the input to fixing the file.
>> One of these options could be considered:
>> A file in the sys filesytem which would accept inode numbers.
>> This could be used to communication back what has to be fixed or is fixed.
>> You could write:
>> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>> or
>> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>> 
> 
> Are you sure this is reasonable interface? I mean.... sysfs is
> supposed to be one value per file. And I don't think its suitable for
> running commands.
Corrupted files (inodes) should be rare on an OCFS2 file system, so we do not expect
large numbers of commands to be run through this interface, or to need high performance from it.
Second, after online file check is added, we also plan to add a mount option "error=fix", which means
the file system can fix these errors automatically without a manually triggered command.

Thanks
Gang

> 
> ...or returning back results.
> 									Pavel
> -- 
> (english) http://www.livejournal.com/~pavelmachek 
> (cesky, pictures) 
> http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html 
> .


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-03  2:05     ` [Ocfs2-devel] " Gang He
@ 2015-12-03  5:17       ` Greg KH
  -1 siblings, 0 replies; 80+ messages in thread
From: Greg KH @ 2015-12-03  5:17 UTC (permalink / raw)
  To: Gang He; +Cc: pavel, akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, linux-kernel

On Wed, Dec 02, 2015 at 07:05:27PM -0700, Gang He wrote:
> Hello Pavel,
> 
> 
> 
> >>> 
> > On Wed 2015-10-28 14:25:57, Gang He wrote:
> >> When there are errors in the ocfs2 filesystem,
> >> they are usually accompanied by the inode number which caused the error.
> >> This inode number would be the input to fixing the file.
> >> One of these options could be considered:
> >> A file in the sys filesytem which would accept inode numbers.
> >> This could be used to communication back what has to be fixed or is fixed.
> >> You could write:
> >> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> or
> >> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> 
> > 
> > Are you sure this is reasonable interface? I mean.... sysfs is
> > supposed to be one value per file. And I don't think its suitable for
> > running commands.
> Usually, the corrupted file (inode) should be rarely encountered for OCFS2 file system, then
> lots of commands are executed via this interface with high performance is not expected by us.
> Second, after online file check is added, we also plan to add a mount option "error=fix", that means
> the file system can fix these errors automatically without a manual command triggering.

It's not a "performance" issue, it's a "sysfs files only have one value"
type thing.  Have two files, "inode_fix" and "inode_check" and then just
write the inode into them, no need to have a "verb <inode>" type parser.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 80+ messages in thread
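
To illustrate the interface point in the message above: with a single
"verb <inode>" file the store handler has to tokenize a command string, while
with one file per action the file name carries the verb and the handler parses
a single number. The following is a minimal userspace sketch in C, not code
from the patch series. The function names and the enum are assumptions made
for illustration, the file names inode_check and inode_fix come from the
suggestion above, and a real sysfs store callback has a different signature
and would typically use a kernel helper such as kstrtoull() instead of
strtoull().

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical action codes; not taken from the patch series. */
enum filecheck_action { FILECHECK_CHECK, FILECHECK_FIX };

/*
 * Parse one decimal inode number. A single trailing newline is accepted
 * because `echo` appends one. Returns 0 on success, -EINVAL otherwise.
 */
static int parse_inode(const char *buf, unsigned long long *ino)
{
	char *end;
	unsigned long long val;

	assert(buf != NULL && ino != NULL);
	/* Reject empty input, signs and leading whitespace up front. */
	if (*buf < '0' || *buf > '9')
		return -EINVAL;
	errno = 0;
	val = strtoull(buf, &end, 10);
	if (errno == ERANGE)
		return -EINVAL;
	if (*end == '\n')
		end++;
	if (*end != '\0')
		return -EINVAL;
	*ino = val;
	return 0;
}

/* Style A: one file that takes "<verb> <inode>", as in the cover letter. */
static int parse_verb_inode(const char *buf, enum filecheck_action *act,
			    unsigned long long *ino)
{
	enum filecheck_action a;
	const char *arg;
	int ret;

	if (strncasecmp(buf, "CHECK ", 6) == 0) {
		a = FILECHECK_CHECK;
		arg = buf + 6;
	} else if (strncasecmp(buf, "FIX ", 4) == 0) {
		a = FILECHECK_FIX;
		arg = buf + 4;
	} else {
		return -EINVAL;
	}
	ret = parse_inode(arg, ino);
	if (ret == 0)
		*act = a;
	return ret;
}

/*
 * Style B: one file per action. The file name selects the action, so each
 * handler only has to parse the single value written to it.
 */
static int inode_check_store(const char *buf, unsigned long long *ino)
{
	return parse_inode(buf, ino);
}

static int inode_fix_store(const char *buf, unsigned long long *ino)
{
	return parse_inode(buf, ino);
}
```

In style B, writing a bare number to a (hypothetical)
/sys/fs/ocfs2/<devname>/inode_check file needs no parsing beyond the number
itself, and an unsupported action is simply a file that does not exist rather
than a string the kernel has to reject.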

* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-03  5:17       ` [Ocfs2-devel] " Greg KH
@ 2015-12-04  8:36         ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-12-04  8:36 UTC (permalink / raw)
  To: greg; +Cc: akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, pavel, linux-kernel

Hi Greg,


>>> 
> On Wed, Dec 02, 2015 at 07:05:27PM -0700, Gang He wrote:
>> Hello Pavel,
>> 
>> 
>> 
>> >>> 
>> > On Wed 2015-10-28 14:25:57, Gang He wrote:
>> >> When there are errors in the ocfs2 filesystem,
>> >> they are usually accompanied by the inode number which caused the error.
>> >> This inode number would be the input to fixing the file.
>> >> One of these options could be considered:
>> >> A file in the sys filesytem which would accept inode numbers.
>> >> This could be used to communication back what has to be fixed or is fixed.
>> >> You could write:
>> >> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>> >> or
>> >> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>> >> 
>> > 
>> > Are you sure this is reasonable interface? I mean.... sysfs is
>> > supposed to be one value per file. And I don't think its suitable for
>> > running commands.
>> Usually, the corrupted file (inode) should be rarely encountered for OCFS2
>> file system, then lots of commands are executed via this interface with high
>> performance is not expected by us.
>> Second, after online file check is added, we also plan to add a mount option
>> "error=fix", that means the file system can fix these errors automatically
>> without a manual command triggering.
> 
> It's not a "performance" issue, it's a "sysfs files only have one value"
> type thing.  Have two files, "inode_fix" and "inode_check" and then just
> write the inode into them, no need to have a "verb <inode>" type parser.
Currently we have three functional items ("check", "fix" and "set"), and we may add more in the future.
With one file per item, we would need to create a sys file and add related code for each of them (and some of that code would be duplicated),
so I prefer one sys file that handles multiple sub-commands.

Thanks
Gang

> 
> thanks,
> 
> greg k-h


^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-04  8:36         ` [Ocfs2-devel] " Gang He
@ 2015-12-04  9:20           ` Pavel Machek
  -1 siblings, 0 replies; 80+ messages in thread
From: Pavel Machek @ 2015-12-04  9:20 UTC (permalink / raw)
  To: Gang He; +Cc: greg, akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, linux-kernel

On Fri 2015-12-04 01:36:21, Gang He wrote:
> Hi Greg,
> 
> 
> >>> 
> > On Wed, Dec 02, 2015 at 07:05:27PM -0700, Gang He wrote:
> >> Hello Pavel,
> >> 
> >> 
> >> 
> >> >>> 
> >> > On Wed 2015-10-28 14:25:57, Gang He wrote:
> >> >> When there are errors in the ocfs2 filesystem,
> >> >> they are usually accompanied by the inode number which caused the error.
> >> >> This inode number would be the input to fixing the file.
> >> >> One of these options could be considered:
> >> >> A file in the sys filesytem which would accept inode numbers.
> >> >> This could be used to communication back what has to be fixed or is fixed.
> >> >> You could write:
> >> >> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> >> or
> >> >> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> >> 
> >> > 
> >> > Are you sure this is reasonable interface? I mean.... sysfs is
> >> > supposed to be one value per file. And I don't think its suitable for
> >> > running commands.
> >> Usually, the corrupted file (inode) should be rarely encountered for OCFS2
> >> file system, then lots of commands are executed via this interface with high
> >> performance is not expected by us.
> >> Second, after online file check is added, we also plan to add a mount option
> >> "error=fix", that means the file system can fix these errors automatically
> >> without a manual command triggering.
> > 
> > It's not a "performance" issue, it's a "sysfs files only have one value"
> > type thing.  Have two files, "inode_fix" and "inode_check" and then just
> > write the inode into them, no need to have a "verb <inode>" type parser.
> Current, we have three functional items "check, fix and set", in the future, maybe we can add more item.
> Then, for each functional item, we need to create a sys file and add related code (actual some code is duplicated),
> I prefer to one sys file to handle multiple sub-commands.

And we prefer not to have your code in tree.

Please design some reasonable interface. Abusing sysfs for this is not
right.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 80+ messages in thread

* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-04  8:36         ` [Ocfs2-devel] " Gang He
@ 2015-12-04 16:40           ` Greg KH
  -1 siblings, 0 replies; 80+ messages in thread
From: Greg KH @ 2015-12-04 16:40 UTC (permalink / raw)
  To: Gang He; +Cc: akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, pavel, linux-kernel

On Fri, Dec 04, 2015 at 01:36:21AM -0700, Gang He wrote:
> Hi Greg,
> 
> 
> >>> 
> > On Wed, Dec 02, 2015 at 07:05:27PM -0700, Gang He wrote:
> >> Hello Pavel,
> >> 
> >> 
> >> 
> >> >>> 
> >> > On Wed 2015-10-28 14:25:57, Gang He wrote:
> >> >> When there are errors in the ocfs2 filesystem,
> >> >> they are usually accompanied by the inode number which caused the error.
> >> >> This inode number would be the input to fixing the file.
> >> >> One of these options could be considered:
> >> >> A file in the sys filesytem which would accept inode numbers.
> >> >> This could be used to communication back what has to be fixed or is fixed.
> >> >> You could write:
> >> >> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> >> or
> >> >> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
> >> >> 
> >> > 
> >> > Are you sure this is reasonable interface? I mean.... sysfs is
> >> > supposed to be one value per file. And I don't think its suitable for
> >> > running commands.
> >> Usually, the corrupted file (inode) should be rarely encountered for OCFS2 file system, then
> >> lots of commands are executed via this interface with high performance is not expected by us.
> >> Second, after online file check is added, we also plan to add a mount option "error=fix", that means
> >> the file system can fix these errors automatically without a manual command triggering.
> > 
> > It's not a "performance" issue, it's a "sysfs files only have one value"
> > type thing.  Have two files, "inode_fix" and "inode_check" and then just
> > write the inode into them, no need to have a "verb <inode>" type parser.
> Current, we have three functional items "check, fix and set", in the future, maybe we can add more item.
> Then, for each functional item, we need to create a sys file and add related code (actual some code is duplicated),
> I prefer to one sys file to handle multiple sub-commands.

No, sorry, that is not how sysfs works.  Please use individual files.
Again, sysfs is "one value per file"; you should never have to write a
"parser" for a sysfs file, either for reading from it or for writing to it.

If you need additional things in the future, great, add new sysfs files.
That is the easiest way for your userspace tools to determine whether a
feature is present in the kernel: they never have to write a command
without knowing whether the kernel can handle it.

thanks,

greg k-h
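
A minimal userspace sketch of the "one value per file" layout described
above, assuming the two attribute names suggested in this thread
("inode_check" and "inode_fix"). FILECHECK_DIR and both helper functions
are hypothetical; the directory stands in for a per-device sysfs
directory such as /sys/fs/ocfs2/<devname>, and the demo uses a scratch
directory so it runs without OCFS2. It is an illustration of the idea,
not the interface that was eventually merged.

```shell
#!/bin/sh
# Sketch only: FILECHECK_DIR stands in for a per-device sysfs directory.
# The attribute names follow the suggestion in this thread and are an
# assumption, not a documented kernel ABI.

# Succeed if the named attribute file exists, i.e. the feature is present.
filecheck_supported() {
    [ -e "$FILECHECK_DIR/$1" ]
}

# Write a single inode number into one attribute file.
# There is no "verb <inode>" string, so nothing needs to be parsed.
filecheck_submit() {
    attr=$1
    ino=$2
    if ! filecheck_supported "$attr"; then
        echo "kernel does not provide $attr" >&2
        return 1
    fi
    printf '%s\n' "$ino" > "$FILECHECK_DIR/$attr"
}

# Demo against a scratch directory instead of real sysfs.
FILECHECK_DIR=$(mktemp -d)
: > "$FILECHECK_DIR/inode_check"    # pretend only "check" is available

filecheck_submit inode_check 12345 && echo "check requested"
filecheck_submit inode_fix 12345 || echo "fix not supported here"
```

Because each operation has its own file, the tool can test for a feature
with a plain existence check before it writes anything, which is the
feature-detection benefit described in the message above.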


* Re: [PATCH v2 0/4] Add online file check feature
  2015-12-04 16:40           ` [Ocfs2-devel] " Greg KH
@ 2015-12-07  3:33             ` Gang He
  -1 siblings, 0 replies; 80+ messages in thread
From: Gang He @ 2015-12-07  3:33 UTC (permalink / raw)
  To: greg; +Cc: akpm, ocfs2-devel, Mark Fasheh, rgoldwyn, pavel, linux-kernel

Hello Greg and Pavel,

Sorry, there was a misunderstanding; I was not aware that there were design constraints for sysfs interfaces. I will review and modify this portion of the code.


Thanks
Gang 


>>> 
> On Fri, Dec 04, 2015 at 01:36:21AM -0700, Gang He wrote:
>> Hi Greg,
>> 
>> 
>> >>> 
>> > On Wed, Dec 02, 2015 at 07:05:27PM -0700, Gang He wrote:
>> >> Hello Pavel,
>> >> 
>> >> 
>> >> 
>> >> >>> 
>> >> > On Wed 2015-10-28 14:25:57, Gang He wrote:
>> >> >> When there are errors in the ocfs2 filesystem,
>> >> >> they are usually accompanied by the inode number which caused the error.
>> >> >> This inode number would be the input to fixing the file.
>> >> >> One of these options could be considered:
>> >> >> A file in the sys filesytem which would accept inode numbers.
>> >> >> This could be used to communication back what has to be fixed or is fixed.
>> >> >> You could write:
>> >> >> $# echo "CHECK <inode>" > /sys/fs/ocfs2/devname/filecheck
>> >> >> or
>> >> >> $# echo "FIX <inode>" > /sys/fs/ocfs2/devname/filecheck
>> >> >> 
>> >> > 
>> >> > Are you sure this is reasonable interface? I mean.... sysfs is
>> >> > supposed to be one value per file. And I don't think its suitable for
>> >> > running commands.
>> >> Usually, the corrupted file (inode) should be rarely encountered for OCFS2 file system, then
>> >> lots of commands are executed via this interface with high performance is not expected by us.
>> >> Second, after online file check is added, we also plan to add a mount option "error=fix", that means
>> >> the file system can fix these errors automatically without a manual command triggering.
>> > 
>> > It's not a "performance" issue, it's a "sysfs files only have one value"
>> > type thing.  Have two files, "inode_fix" and "inode_check" and then just
>> > write the inode into them, no need to have a "verb <inode>" type parser.
>> Current, we have three functional items "check, fix and set", in the future, maybe we can add more item.
>> Then, for each functional item, we need to create a sys file and add related code (actual some code is duplicated),
>> I prefer to one sys file to handle multiple sub-commands.
> 
> No, sorry, that is not how sysfs works.  Please use individual files,
> again, sysfs is "one value per file" you should never have to write a
> "parser" for a sysfs file either reading, or writing to it.
> 
> If you need additional things in the future, great, add new sysfs files,
> that makes it the easiest way for your userspace tools to be able to
> determine if that feature is present in the kernel or not, it does not
> have to write a command that it doesn't know if the kernel can handle or
> not.
> 
> thanks,
> 
> greg k-h



* Re: [PATCH v2 2/4] ocfs2: sysfile interfaces for online file check
  2015-11-25  3:29             ` [Ocfs2-devel] " Gang He
@ 2015-12-18 22:37               ` Mark Fasheh
  -1 siblings, 0 replies; 80+ messages in thread
From: Mark Fasheh @ 2015-12-18 22:37 UTC (permalink / raw)
  To: Gang He; +Cc: Junxiao Bi, akpm, ocfs2-devel, rgoldwyn, linux-kernel

On Tue, Nov 24, 2015 at 08:29:41PM -0700, Gang He wrote:
> Hi Mark and Junxiao,
> 
> 
> >>> 
> > On Tue, Nov 03, 2015 at 04:20:27PM +0800, Junxiao Bi wrote:
> >> Hi Gang,
> >> 
> >> On 11/03/2015 03:54 PM, Gang He wrote:
> >> > Hi Junxiao,
> >> > 
> >> > Thank for your reviewing.
> >> > Current design, we use a sysfile as a interface to check/fix a file (via pass a ino number).
> >> > But, this operation is manually triggered by user, instead of automatically fix in the kernel.
> >> > Why?
> >> > 1) we should let users make this decision, since some users do not want to fix when encountering a file system corruption, maybe they want to keep the file system unchanged for a further investigation.
> >> If user don't want this, they should not use error=continue option, let
> >> fs go after a corruption is very dangerous.
> > 
> > Maybe we need another errors=XXX flag (maybe errors=fix)?
> > 
> > You both make good points, here's what I gather from the conversation:
> > 
> >  - Some customers would be sad if they have to manually fix corruptions.
> >    This takes effort on their part, and if the FS can handle it
> >    automatically, it should.
> > 
> >  - There are valid concerns that automatically fixing things is a change in
> >    behavior that might not be welcome, or worse might lead to unforeseeable
> >    circumstances.
> > 
> >  - I will add that fixing things automatically implies checking them
> >    automatically which could introduce some performance impact depending on
> >    how much checking we're doing.
> > 
> > So if the user wants errors to be fixed automatically, they could mount with
> > errors=fix, and everyone else would have no change in behavior unless they
> > wanted to make use of the new feature.
> That is what I want to say, add a mount option to let users to decide. Here, I want to split "error=fix"
> mount option  task out from online file check feature, I think this part should be a independent feature.
> We can implement this feature after online file check is done, I want to split the feature into some more 
> detailed features, implement them one by one. Do you agree this point?

Yeah that's fine, I would have automatic checking turned off though until we
have a good plan in place for users who do / don't want this.
	--Mark

--
Mark Fasheh


end of thread, other threads:[~2015-12-18 22:37 UTC | newest]

Thread overview: 80+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-10-28  6:25 [PATCH v2 0/4] Add online file check feature Gang He
2015-10-28  6:25 ` [Ocfs2-devel] " Gang He
2015-10-28  6:25 ` [PATCH v2 1/4] ocfs2: export ocfs2_kset for online file check Gang He
2015-10-28  6:25   ` [Ocfs2-devel] " Gang He
2015-11-24 21:47   ` Mark Fasheh
2015-11-24 21:47     ` [Ocfs2-devel] " Mark Fasheh
2015-10-28  6:25 ` [PATCH v2 2/4] ocfs2: sysfile interfaces " Gang He
2015-10-28  6:25   ` [Ocfs2-devel] " Gang He
2015-11-03  7:20   ` Junxiao Bi
2015-11-03  7:20     ` [Ocfs2-devel] " Junxiao Bi
2015-11-03  7:54     ` Gang He
2015-11-03  7:54       ` [Ocfs2-devel] " Gang He
2015-11-03  8:20       ` Junxiao Bi
2015-11-03  8:20         ` [Ocfs2-devel] " Junxiao Bi
2015-11-03  8:30         ` Gang He
2015-11-03  8:30           ` [Ocfs2-devel] " Gang He
2015-11-24 21:46         ` Mark Fasheh
2015-11-24 21:46           ` [Ocfs2-devel] " Mark Fasheh
2015-11-24 21:55           ` Srinivas Eeda
2015-11-24 21:55             ` Srinivas Eeda
2015-11-25  3:29           ` Gang He
2015-11-25  3:29             ` [Ocfs2-devel] " Gang He
2015-11-25  4:43             ` Junxiao Bi
2015-11-25  4:43               ` [Ocfs2-devel] " Junxiao Bi
2015-11-25  5:11               ` Gang He
2015-11-25  5:11                 ` [Ocfs2-devel] " Gang He
2015-12-18 22:37             ` Mark Fasheh
2015-12-18 22:37               ` [Ocfs2-devel] " Mark Fasheh
2015-11-25  4:33           ` Junxiao Bi
2015-11-25  4:33             ` [Ocfs2-devel] " Junxiao Bi
2015-11-24 21:52   ` Mark Fasheh
2015-11-24 21:52     ` Mark Fasheh
2015-10-28  6:26 ` [PATCH v2 3/4] ocfs2: create/remove sysfile " Gang He
2015-10-28  6:26   ` [Ocfs2-devel] " Gang He
2015-11-24 21:53   ` Mark Fasheh
2015-11-24 21:53     ` [Ocfs2-devel] " Mark Fasheh
2015-10-28  6:26 ` [PATCH v2 4/4] ocfs2: check/fix inode block " Gang He
2015-10-28  6:26   ` [Ocfs2-devel] " Gang He
2015-11-03  7:12   ` Junxiao Bi
2015-11-03  7:12     ` [Ocfs2-devel] " Junxiao Bi
2015-11-03  8:15     ` Gang He
2015-11-03  8:15       ` [Ocfs2-devel] " Gang He
2015-11-03  8:29       ` Junxiao Bi
2015-11-03  8:29         ` [Ocfs2-devel] " Junxiao Bi
2015-11-03  8:47         ` Gang He
2015-11-03  8:47           ` [Ocfs2-devel] " Gang He
2015-11-03  9:01           ` Junxiao Bi
2015-11-03  9:01             ` [Ocfs2-devel] " Junxiao Bi
2015-11-03  9:25             ` Gang He
2015-11-03  9:25               ` [Ocfs2-devel] " Gang He
2015-11-24 22:16     ` Mark Fasheh
2015-11-24 22:16       ` Mark Fasheh
2015-11-25  4:11       ` Junxiao Bi
2015-11-25  4:11         ` Junxiao Bi
2015-11-25  5:04         ` Gang He
2015-11-25  5:04           ` Gang He
2015-11-25  5:44           ` Junxiao Bi
2015-11-25  5:44             ` Junxiao Bi
2015-10-28 16:34 ` [Ocfs2-devel] [PATCH v2 0/4] Add online file check feature Srinivas Eeda
2015-10-28 16:34   ` Srinivas Eeda
2015-10-29  4:44   ` Gang He
2015-10-29  4:44     ` Gang He
2015-10-29  7:46     ` Srinivas Eeda
2015-10-29  7:46       ` Srinivas Eeda
2015-10-29  8:26       ` Gang He
2015-10-29  8:26         ` Gang He
2015-12-02 18:20 ` Pavel Machek
2015-12-02 18:20   ` [Ocfs2-devel] " Pavel Machek
2015-12-03  2:05   ` Gang He
2015-12-03  2:05     ` [Ocfs2-devel] " Gang He
2015-12-03  5:17     ` Greg KH
2015-12-03  5:17       ` [Ocfs2-devel] " Greg KH
2015-12-04  8:36       ` Gang He
2015-12-04  8:36         ` [Ocfs2-devel] " Gang He
2015-12-04  9:20         ` Pavel Machek
2015-12-04  9:20           ` [Ocfs2-devel] " Pavel Machek
2015-12-04 16:40         ` Greg KH
2015-12-04 16:40           ` [Ocfs2-devel] " Greg KH
2015-12-07  3:33           ` Gang He
2015-12-07  3:33             ` [Ocfs2-devel] " Gang He
