ocfs2-devel.lists.linux.dev archive mirror
* [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
@ 2022-06-03 22:28 Junxiao Bi via Ocfs2-devel
  2022-06-04  8:45 ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Junxiao Bi via Ocfs2-devel @ 2022-06-03 22:28 UTC (permalink / raw)
  To: ocfs2-devel

This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.

This commit introduced a regression that can cause a mount hang.
The change in __ocfs2_find_empty_slot() allows any node with a
non-zero node number to grab the slot that was already taken by
node 0, so node 1 will access the same journal as node 0; when it
tries to grab the journal cluster lock, it will hang because the
lock is already held by node 0.
This is very easy to reproduce: in one cluster, mount node 0 first,
then node 1, and you will see the following call trace on node 1.

[13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
[13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
[13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
[13148.749354] Call Trace:
[13148.750718]  <TASK>
[13148.752019]  ? usleep_range+0x90/0x89
[13148.753882]  __schedule+0x210/0x567
[13148.755684]  schedule+0x44/0xa8
[13148.757270]  schedule_timeout+0x106/0x13c
[13148.759273]  ? __prepare_to_swait+0x53/0x78
[13148.761218]  __wait_for_common+0xae/0x163
[13148.763144]  __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
[13148.765780]  ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.768312]  ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
[13148.770968]  ocfs2_journal_init+0x91/0x340 [ocfs2]
[13148.773202]  ocfs2_check_volume+0x39/0x461 [ocfs2]
[13148.775401]  ? iput+0x69/0xba
[13148.777047]  ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
[13148.779646]  ocfs2_fill_super+0x54b/0x853 [ocfs2]
[13148.781756]  mount_bdev+0x190/0x1b7
[13148.783443]  ? ocfs2_remount+0x440/0x440 [ocfs2]
[13148.785634]  legacy_get_tree+0x27/0x48
[13148.787466]  vfs_get_tree+0x25/0xd0
[13148.789270]  do_new_mount+0x18c/0x2d9
[13148.791046]  __x64_sys_mount+0x10e/0x142
[13148.792911]  do_syscall_64+0x3b/0x89
[13148.794667]  entry_SYSCALL_64_after_hwframe+0x170/0x0
[13148.797051] RIP: 0033:0x7f2309f6e26e
[13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
[13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
[13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
[13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
[13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
[13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
[13148.816564]  </TASK>

To fix it, we could just fix __ocfs2_find_empty_slot(). But the original
commit introduced a feature to mount an ocfs2 volume locally even though
it is cluster based, which is very dangerous: it can easily cause serious
data corruption, because there is no way to stop other nodes from mounting
the fs and corrupting it. Setting up ha or another cluster-aware stack is
simply the cost we have to pay to avoid corruption; otherwise we would have
to enforce this in the kernel.

Fixes: 912f655d78c5 ("ocfs2: mount shared volume without ha stack")
Cc: <stable@vger.kernel.org>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
---
 fs/ocfs2/ocfs2.h    |  4 +---
 fs/ocfs2/slot_map.c | 46 +++++++++++++++++++--------------------------
 fs/ocfs2/super.c    | 21 ---------------------
 3 files changed, 20 insertions(+), 51 deletions(-)

diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 337527571461..740b64238312 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -277,7 +277,6 @@ enum ocfs2_mount_options
 	OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT = 1 << 15,  /* Journal Async Commit */
 	OCFS2_MOUNT_ERRORS_CONT = 1 << 16, /* Return EIO to the calling process on error */
 	OCFS2_MOUNT_ERRORS_ROFS = 1 << 17, /* Change filesystem to read-only on error */
-	OCFS2_MOUNT_NOCLUSTER = 1 << 18, /* No cluster aware filesystem mount */
 };
 
 #define OCFS2_OSB_SOFT_RO	0x0001
@@ -673,8 +672,7 @@ static inline int ocfs2_cluster_o2cb_global_heartbeat(struct ocfs2_super *osb)
 
 static inline int ocfs2_mount_local(struct ocfs2_super *osb)
 {
-	return ((osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT)
-		|| (osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER));
+	return (osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT);
 }
 
 static inline int ocfs2_uses_extended_slot_map(struct ocfs2_super *osb)
diff --git a/fs/ocfs2/slot_map.c b/fs/ocfs2/slot_map.c
index 0b0ae3ebb0cf..da7718cef735 100644
--- a/fs/ocfs2/slot_map.c
+++ b/fs/ocfs2/slot_map.c
@@ -252,16 +252,14 @@ static int __ocfs2_find_empty_slot(struct ocfs2_slot_info *si,
 	int i, ret = -ENOSPC;
 
 	if ((preferred >= 0) && (preferred < si->si_num_slots)) {
-		if (!si->si_slots[preferred].sl_valid ||
-		    !si->si_slots[preferred].sl_node_num) {
+		if (!si->si_slots[preferred].sl_valid) {
 			ret = preferred;
 			goto out;
 		}
 	}
 
 	for(i = 0; i < si->si_num_slots; i++) {
-		if (!si->si_slots[i].sl_valid ||
-		    !si->si_slots[i].sl_node_num) {
+		if (!si->si_slots[i].sl_valid) {
 			ret = i;
 			break;
 		}
@@ -456,30 +454,24 @@ int ocfs2_find_slot(struct ocfs2_super *osb)
 	spin_lock(&osb->osb_lock);
 	ocfs2_update_slot_info(si);
 
-	if (ocfs2_mount_local(osb))
-		/* use slot 0 directly in local mode */
-		slot = 0;
-	else {
-		/* search for ourselves first and take the slot if it already
-		 * exists. Perhaps we need to mark this in a variable for our
-		 * own journal recovery? Possibly not, though we certainly
-		 * need to warn to the user */
-		slot = __ocfs2_node_num_to_slot(si, osb->node_num);
+	/* search for ourselves first and take the slot if it already
+	 * exists. Perhaps we need to mark this in a variable for our
+	 * own journal recovery? Possibly not, though we certainly
+	 * need to warn to the user */
+	slot = __ocfs2_node_num_to_slot(si, osb->node_num);
+	if (slot < 0) {
+		/* if no slot yet, then just take 1st available
+		 * one. */
+		slot = __ocfs2_find_empty_slot(si, osb->preferred_slot);
 		if (slot < 0) {
-			/* if no slot yet, then just take 1st available
-			 * one. */
-			slot = __ocfs2_find_empty_slot(si, osb->preferred_slot);
-			if (slot < 0) {
-				spin_unlock(&osb->osb_lock);
-				mlog(ML_ERROR, "no free slots available!\n");
-				status = -EINVAL;
-				goto bail;
-			}
-		} else
-			printk(KERN_INFO "ocfs2: Slot %d on device (%s) was "
-			       "already allocated to this node!\n",
-			       slot, osb->dev_str);
-	}
+			spin_unlock(&osb->osb_lock);
+			mlog(ML_ERROR, "no free slots available!\n");
+			status = -EINVAL;
+			goto bail;
+		}
+	} else
+		printk(KERN_INFO "ocfs2: Slot %d on device (%s) was already "
+		       "allocated to this node!\n", slot, osb->dev_str);
 
 	ocfs2_set_slot(si, slot, osb->node_num);
 	osb->slot_num = slot;
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index f7298816d8d9..438be028935d 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -172,7 +172,6 @@ enum {
 	Opt_dir_resv_level,
 	Opt_journal_async_commit,
 	Opt_err_cont,
-	Opt_nocluster,
 	Opt_err,
 };
 
@@ -206,7 +205,6 @@ static const match_table_t tokens = {
 	{Opt_dir_resv_level, "dir_resv_level=%u"},
 	{Opt_journal_async_commit, "journal_async_commit"},
 	{Opt_err_cont, "errors=continue"},
-	{Opt_nocluster, "nocluster"},
 	{Opt_err, NULL}
 };
 
@@ -618,13 +616,6 @@ static int ocfs2_remount(struct super_block *sb, int *flags, char *data)
 		goto out;
 	}
 
-	tmp = OCFS2_MOUNT_NOCLUSTER;
-	if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
-		ret = -EINVAL;
-		mlog(ML_ERROR, "Cannot change nocluster option on remount\n");
-		goto out;
-	}
-
 	tmp = OCFS2_MOUNT_HB_LOCAL | OCFS2_MOUNT_HB_GLOBAL |
 		OCFS2_MOUNT_HB_NONE;
 	if ((osb->s_mount_opt & tmp) != (parsed_options.mount_opt & tmp)) {
@@ -865,7 +856,6 @@ static int ocfs2_verify_userspace_stack(struct ocfs2_super *osb,
 	}
 
 	if (ocfs2_userspace_stack(osb) &&
-	    !(osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER) &&
 	    strncmp(osb->osb_cluster_stack, mopt->cluster_stack,
 		    OCFS2_STACK_LABEL_LEN)) {
 		mlog(ML_ERROR,
@@ -1137,11 +1127,6 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	       osb->s_mount_opt & OCFS2_MOUNT_DATA_WRITEBACK ? "writeback" :
 	       "ordered");
 
-	if ((osb->s_mount_opt & OCFS2_MOUNT_NOCLUSTER) &&
-	   !(osb->s_feature_incompat & OCFS2_FEATURE_INCOMPAT_LOCAL_MOUNT))
-		printk(KERN_NOTICE "ocfs2: The shared device (%s) is mounted "
-		       "without cluster aware mode.\n", osb->dev_str);
-
 	atomic_set(&osb->vol_state, VOLUME_MOUNTED);
 	wake_up(&osb->osb_mount_event);
 
@@ -1452,9 +1437,6 @@ static int ocfs2_parse_options(struct super_block *sb,
 		case Opt_journal_async_commit:
 			mopt->mount_opt |= OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT;
 			break;
-		case Opt_nocluster:
-			mopt->mount_opt |= OCFS2_MOUNT_NOCLUSTER;
-			break;
 		default:
 			mlog(ML_ERROR,
 			     "Unrecognized mount option \"%s\" "
@@ -1566,9 +1548,6 @@ static int ocfs2_show_options(struct seq_file *s, struct dentry *root)
 	if (opts & OCFS2_MOUNT_JOURNAL_ASYNC_COMMIT)
 		seq_printf(s, ",journal_async_commit");
 
-	if (opts & OCFS2_MOUNT_NOCLUSTER)
-		seq_printf(s, ",nocluster");
-
 	return 0;
 }
 
-- 
2.24.3 (Apple Git-128)


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel


* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-03 22:28 [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack" Junxiao Bi via Ocfs2-devel
@ 2022-06-04  8:45 ` heming.zhao--- via Ocfs2-devel
  2022-06-04 16:19   ` Junxiao Bi via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-06-04  8:45 UTC (permalink / raw)
  To: Junxiao Bi, ocfs2-devel

Hello Junxiao,

On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
> 
> This commit introduced a regression that can cause mount hung.
> The changes in __ocfs2_find_empty_slot causes that any node with
> none-zero node number can grab the slot that was already taken by
> node 0, so node 1 will access the same journal with node 0, when it
> try to grab journal cluster lock, it will hung because it was already
> acquired by node 0.
> It's very easy to reproduce this, in one cluster, mount node 0 first,
> then node 1, you will see the following call trace from node 1.

From your description, it looks like your env mixed local mounts and clustered mounts.

Would you mind sharing your test/reproduction steps?
And which ha stack do you use, pcmk or o2cb?

I failed to reproduce it; my test steps (with the pcmk stack):
```
node1:
mount -t ocfs2 /dev/vdd /mnt

node2:
for i in {1..100}; do
  echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
  sleep 3;
  echo "umount"; umount /mnt;
done
```

This local mount feature helps SUSE customers maintain ocfs2 partitions; it's useful.
I want to find out whether there is an ideal way to fix the hang issue.

> 
> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
> [13148.749354] Call Trace:
> [13148.750718]  <TASK>
> [13148.752019]  ? usleep_range+0x90/0x89
> [13148.753882]  __schedule+0x210/0x567
> [13148.755684]  schedule+0x44/0xa8
> [13148.757270]  schedule_timeout+0x106/0x13c
> [13148.759273]  ? __prepare_to_swait+0x53/0x78
> [13148.761218]  __wait_for_common+0xae/0x163
> [13148.763144]  __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
> [13148.765780]  ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
> [13148.768312]  ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
> [13148.770968]  ocfs2_journal_init+0x91/0x340 [ocfs2]
> [13148.773202]  ocfs2_check_volume+0x39/0x461 [ocfs2]
> [13148.775401]  ? iput+0x69/0xba
> [13148.777047]  ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
> [13148.779646]  ocfs2_fill_super+0x54b/0x853 [ocfs2]
> [13148.781756]  mount_bdev+0x190/0x1b7
> [13148.783443]  ? ocfs2_remount+0x440/0x440 [ocfs2]
> [13148.785634]  legacy_get_tree+0x27/0x48
> [13148.787466]  vfs_get_tree+0x25/0xd0
> [13148.789270]  do_new_mount+0x18c/0x2d9
> [13148.791046]  __x64_sys_mount+0x10e/0x142
> [13148.792911]  do_syscall_64+0x3b/0x89
> [13148.794667]  entry_SYSCALL_64_after_hwframe+0x170/0x0
> [13148.797051] RIP: 0033:0x7f2309f6e26e
> [13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
> [13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
> [13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
> [13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
> [13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
> [13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
> [13148.816564]  </TASK>
> 
> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
> introduced the feature to mount ocfs2 locally even it is cluster based,
> that is a very dangerous, it can easily cause serious data corruption,
> there is no way to stop other nodes mounting the fs and corrupting it.

I can't follow your meaning. When users want to use the local mount feature, they MUST
know what they are doing and how to use it.

mount.ocfs2(8) also says to *only* mount the fs on *one* node at the same time,
and warns the user that the fs will be damaged under wrong usage.

```
nocluster

   This option allows users to mount a clustered volume without configuring the cluster
   stack. However, you must be aware that you can only mount the file system from one node
   at the same time, otherwise, the file system may be damaged. Please use it with caution.
```

> Setup ha or other cluster-aware stack is just the cost that we have to
> take for avoiding corruption, otherwise we have to do it in kernel.

It's a bit drastic to totally revert this commit just over a missing sanity
check. If you or the maintainer think the local mount should do more to prevent
mixing the local-mount and clustered-mount scenarios, we could add more sanity
checks during local mounting.

Thanks,
Heming




* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-04  8:45 ` heming.zhao--- via Ocfs2-devel
@ 2022-06-04 16:19   ` Junxiao Bi via Ocfs2-devel
  2022-06-25 13:30     ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Junxiao Bi via Ocfs2-devel @ 2022-06-04 16:19 UTC (permalink / raw)
  To: heming.zhao; +Cc: ocfs2-devel



> On Jun 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
> 
> Hello Junxiao,
> 
>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>> This commit introduced a regression that can cause mount hung.
>> The changes in __ocfs2_find_empty_slot causes that any node with
>> none-zero node number can grab the slot that was already taken by
>> node 0, so node 1 will access the same journal with node 0, when it
>> try to grab journal cluster lock, it will hung because it was already
>> acquired by node 0.
>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>> then node 1, you will see the following call trace from node 1.
> 
> From your description, it looks your env mixed local-mount & clustered-mount.
No, only cluster mount.
> 
> Could you mind to share your test/reproducible steps.
> And which ha stack do you use, pmck or o2cb?
> 
> I failed to reproduce it, my test steps (with pcmk stack):
> ```
> node1:
> mount -t ocfs2 /dev/vdd /mnt
> 
> node2:
> for i in {1..100}; do
> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
> sleep 3;
> echo "umount"; umount /mnt;
> done
> ```
> 
Try setting one node with node number 0 and mount it there first. I used the o2cb stack.
> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
> I want to find whether there is a idear way to fix the hung issue.
> 
>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>> [13148.749354] Call Trace:
>> [13148.750718]  <TASK>
>> [13148.752019]  ? usleep_range+0x90/0x89
>> [13148.753882]  __schedule+0x210/0x567
>> [13148.755684]  schedule+0x44/0xa8
>> [13148.757270]  schedule_timeout+0x106/0x13c
>> [13148.759273]  ? __prepare_to_swait+0x53/0x78
>> [13148.761218]  __wait_for_common+0xae/0x163
>> [13148.763144]  __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
>> [13148.765780]  ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
>> [13148.768312]  ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
>> [13148.770968]  ocfs2_journal_init+0x91/0x340 [ocfs2]
>> [13148.773202]  ocfs2_check_volume+0x39/0x461 [ocfs2]
>> [13148.775401]  ? iput+0x69/0xba
>> [13148.777047]  ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
>> [13148.779646]  ocfs2_fill_super+0x54b/0x853 [ocfs2]
>> [13148.781756]  mount_bdev+0x190/0x1b7
>> [13148.783443]  ? ocfs2_remount+0x440/0x440 [ocfs2]
>> [13148.785634]  legacy_get_tree+0x27/0x48
>> [13148.787466]  vfs_get_tree+0x25/0xd0
>> [13148.789270]  do_new_mount+0x18c/0x2d9
>> [13148.791046]  __x64_sys_mount+0x10e/0x142
>> [13148.792911]  do_syscall_64+0x3b/0x89
>> [13148.794667]  entry_SYSCALL_64_after_hwframe+0x170/0x0
>> [13148.797051] RIP: 0033:0x7f2309f6e26e
>> [13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
>> [13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
>> [13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
>> [13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
>> [13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
>> [13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
>> [13148.816564]  </TASK>
>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>> introduced the feature to mount ocfs2 locally even it is cluster based,
>> that is a very dangerous, it can easily cause serious data corruption,
>> there is no way to stop other nodes mounting the fs and corrupting it.
> 
> I can't follow your meaning. When users want to use local mount feature, they MUST know
> what they are doing, and how to use it.
I can't agree with you. There is no mechanism to make sure the customer follows that; you can't expect customers to understand the tech well, or even to read the doc.
It's not that you don't have a choice: setting up the cluster stack is the way to stop customers from doing something bad. I believe you have to educate customers that this is the cost of guarding data security; otherwise, when something bad happens, they will lose important data, maybe with no way to recover it.
> 
> From mount.ocfs2 (8), there also writes *only* mount fs on *one* node at the same time.
> And also tell user fs will be damaged under wrong action.
> 
> ```
> nocluster
> 
>  This  option  allows  users  to  mount a clustered volume without configuring the cluster
> 
>  stack.  However, you must be aware that you can only mount the file system from one  node
> 
>  at the same time, otherwise, the file system may be damaged. Please use it with caution.
> ```
> 
>> Setup ha or other cluster-aware stack is just the cost that we have to
>> take for avoiding corruption, otherwise we have to do it in kernel.
> 
> It's a little bit serious to totally revert this commit just under lacking sanity
> check. If you or maintainer think the local mount should do more jobs to prevent mix
> local-mount and clustered-mount scenario, we could add more sanity check during
> local mounting.
I don't think this should be done in the kernel. Setting up the cluster stack is the way forward.

Thanks,
Junxiao
> 
> Thanks,
> Heming
> 


* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-04 16:19   ` Junxiao Bi via Ocfs2-devel
@ 2022-06-25 13:30     ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-06-25 13:30 UTC (permalink / raw)
  To: Junxiao Bi, akpm; +Cc: ocfs2-devel

Since I missed the original mail in my mailbox, I'm replying here.

As discussed in this and another long thread, this feature is incomplete
and we don't have a better fix as of now. It has also caused a regression
with the default o2cb stack, so revert it first as a quick fix.

We can re-introduce this feature once it is mature in the future.

Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com>

On 6/5/22 12:19 AM, Junxiao Bi via Ocfs2-devel wrote:
> 
> 
>> On Jun 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>
>> Hello Junxiao,
>>
>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>> This commit introduced a regression that can cause mount hung.
>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>> none-zero node number can grab the slot that was already taken by
>>> node 0, so node 1 will access the same journal with node 0, when it
>>> try to grab journal cluster lock, it will hung because it was already
>>> acquired by node 0.
>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>> then node 1, you will see the following call trace from node 1.
>>
>> From your description, it looks your env mixed local-mount & clustered-mount.
> No, only cluster mount.
>>
>> Could you mind to share your test/reproducible steps.
>> And which ha stack do you use, pmck or o2cb?
>>
>> I failed to reproduce it, my test steps (with pcmk stack):
>> ```
>> node1:
>> mount -t ocfs2 /dev/vdd /mnt
>>
>> node2:
>> for i in {1..100}; do
>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>> sleep 3;
>> echo "umount"; umount /mnt;
>> done
>> ```
>>
> Try set one node with node number 0 and mount it there first. I used o2cb stack.
>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>> I want to find whether there is a idear way to fix the hung issue.
>>
>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>> [13148.749354] Call Trace:
>>> [13148.750718]  <TASK>
>>> [13148.752019]  ? usleep_range+0x90/0x89
>>> [13148.753882]  __schedule+0x210/0x567
>>> [13148.755684]  schedule+0x44/0xa8
>>> [13148.757270]  schedule_timeout+0x106/0x13c
>>> [13148.759273]  ? __prepare_to_swait+0x53/0x78
>>> [13148.761218]  __wait_for_common+0xae/0x163
>>> [13148.763144]  __ocfs2_cluster_lock.constprop.0+0x1d6/0x870 [ocfs2]
>>> [13148.765780]  ? ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
>>> [13148.768312]  ocfs2_inode_lock_full_nested+0x18d/0x398 [ocfs2]
>>> [13148.770968]  ocfs2_journal_init+0x91/0x340 [ocfs2]
>>> [13148.773202]  ocfs2_check_volume+0x39/0x461 [ocfs2]
>>> [13148.775401]  ? iput+0x69/0xba
>>> [13148.777047]  ocfs2_mount_volume.isra.0.cold+0x40/0x1f5 [ocfs2]
>>> [13148.779646]  ocfs2_fill_super+0x54b/0x853 [ocfs2]
>>> [13148.781756]  mount_bdev+0x190/0x1b7
>>> [13148.783443]  ? ocfs2_remount+0x440/0x440 [ocfs2]
>>> [13148.785634]  legacy_get_tree+0x27/0x48
>>> [13148.787466]  vfs_get_tree+0x25/0xd0
>>> [13148.789270]  do_new_mount+0x18c/0x2d9
>>> [13148.791046]  __x64_sys_mount+0x10e/0x142
>>> [13148.792911]  do_syscall_64+0x3b/0x89
>>> [13148.794667]  entry_SYSCALL_64_after_hwframe+0x170/0x0
>>> [13148.797051] RIP: 0033:0x7f2309f6e26e
>>> [13148.798784] RSP: 002b:00007ffdcee7d408 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
>>> [13148.801974] RAX: ffffffffffffffda RBX: 00007ffdcee7d4a0 RCX: 00007f2309f6e26e
>>> [13148.804815] RDX: 0000559aa762a8ae RSI: 0000559aa939d340 RDI: 0000559aa93a22b0
>>> [13148.807719] RBP: 00007ffdcee7d5b0 R08: 0000559aa93a2290 R09: 00007f230a0b4820
>>> [13148.810659] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffdcee7d420
>>> [13148.813609] R13: 0000000000000000 R14: 0000559aa939f000 R15: 0000000000000000
>>> [13148.816564]  </TASK>
>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>> that is a very dangerous, it can easily cause serious data corruption,
>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>
>> I can't follow your meaning. When users want to use local mount feature, they MUST know
>> what they are doing, and how to use it.
> I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
> It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
>>
>> From mount.ocfs2 (8), there also writes *only* mount fs on *one* node at the same time.
>> And also tell user fs will be damaged under wrong action.
>>
>> ```
>> nocluster
>>
>>  This  option  allows  users  to  mount a clustered volume without configuring the cluster
>>
>>  stack.  However, you must be aware that you can only mount the file system from one  node
>>
>>  at the same time, otherwise, the file system may be damaged. Please use it with caution.
>> ```
>>
>>> Setup ha or other cluster-aware stack is just the cost that we have to
>>> take for avoiding corruption, otherwise we have to do it in kernel.
>>
>> It's a little bit serious to totally revert this commit just under lacking sanity
>> check. If you or maintainer think the local mount should do more jobs to prevent mix
>> local-mount and clustered-mount scenario, we could add more sanity check during
>> local mounting.
> I don’t think this should be done in kernel. Setup cluster stack is the way to forward.



* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-07  6:31             ` Joseph Qi via Ocfs2-devel
@ 2022-06-07  9:42               ` Heming Zhao via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-06-07  9:42 UTC (permalink / raw)
  To: Joseph Qi; +Cc: ocfs2-devel

On Tue, Jun 07, 2022 at 02:31:01PM +0800, Joseph Qi wrote:
> 
> 
> On 6/7/22 11:06 AM, Junxiao Bi wrote:
> >>>> Seems I am missing some mails for this thread.
> >>>> The 'nocluster' mount is introduced by Gang and I think it has real
> >>>> user scenarios. I am curious about since node 0 is commonly used in
> >>>> o2cb, why there is no any bug report before.
> >>>> So let's try to fix the regression first.
> >>> Real user case doesn’t mean this has to been done through kernel? This sounds like doing something in kernel that is to workaround some issue that can be done from user space.
> >>> I didn’t see a Reviewed-by for the patch, how did it get merged?
> >> Gang had left SUSE for some time, and busy with his new job.
> >> I have vague memory, he said this commit approved & merged directly by Andrew Morton.
> >> Gang dedicated to contribute ocfs2 community many years, and set up his competence
> >> to other maintainers & reviewers.
> >>
> >> If Junxiao dislike this feature, and don't want to fix it as a bug.
> >> I am willing to file a patch.
> > To fix, it’s not only the regression it causes, but also do something in mount.ocfs2 to check whether any node is mounting the volume before nocluster mount and also stop other nodes mounting before  nocluster mount is unmounted. That’s to simulate what fsck.ocfs2/mkfs.ocfs2 do. Without that guard, this feature just provides a new way for customer to corrupt their data. It’s just time some customer would do something bad and lost their data. 
> > I will leave how to handle this to you and Joseph, we already reverted this patch.
> 
> Searched the maillist and find the original thread for reference:
> https://lore.kernel.org/ocfs2-devel/CH2PR18MB32064CCD80FE98F03B82A816CFAC0@CH2PR18MB3206.namprd18.prod.outlook.com/
> 
> I suggest we leave nocluster mount as a special mode and make its logic
> won't impact other mode like cluster or local mount.
> Agree with Junxiao, we have to try our best to prevent data corruption
> even mistakenly used by customer.
> 

It's bad news for the Oracle people to revert this feature.
As I said, the hang is not a big bug.

Per Junxiao's commit log, commit 912f655d78c5 introduced the buggy code
in the area below.

@@ -254,14 +254,16 @@ static int __ocfs2_find_empty_slot(struct ocfs2_slot_info *si,
    int i, ret = -ENOSPC;
 
     if ((preferred >= 0) && (preferred < si->si_num_slots)) {
-        if (!si->si_slots[preferred].sl_valid) {
+        if (!si->si_slots[preferred].sl_valid ||
+            !si->si_slots[preferred].sl_node_num) {
             ret = preferred;
             goto out;
         }
     }
 
     for(i = 0; i < si->si_num_slots; i++) {
-        if (!si->si_slots[i].sl_valid) {
+        if (!si->si_slots[i].sl_valid ||
+            !si->si_slots[i].sl_node_num) {
             ret = i;
             break;
         }

The 'if' condition is wrong: sl_node_num can legitimately be 0 in an o2cb env.

With the current information, the (probable) trigger flow:
1>
node1 mounts with 'node_num = 0'; it will succeed.
At this point the slotmap extent block contains es_valid:1 & es_node_num:0 for node1,
and ocfs2_update_disk_slot() writes the slotmap info back to disk.

2>
Then node2 mounts with 'node_num = 1':

ocfs2_find_slot
 + ocfs2_update_slot_info //read slotmap info from disk
 |  + set si->si_slots[0].sl_valid = 1 & si->si_slots[0].sl_node_num = 0
 |
 + __ocfs2_node_num_to_slot //will return -ENOENT.
    __ocfs2_find_empty_slot
     + search preferred (node_num:1) failed
     + 'si->si_slots[0].sl_node_num' is false. trigger 'break' condition.
     + return slot 0  //causes node2 to grab node1's journal dlm lock, triggering the hang


I copied the related comments from the URL Joseph provided:
> https://lore.kernel.org/ocfs2-devel/CH2PR18MB32064CCD80FE98F03B82A816CFAC0@CH2PR18MB3206.namprd18.prod.outlook.com/

```
> > @@ -254,14 +254,16 @@ static int __ocfs2_find_empty_slot(struct
> ocfs2_slot_info *si,
> >  	int i, ret = -ENOSPC;
> >
> >  	if ((preferred >= 0) && (preferred < si->si_num_slots)) {
> > -		if (!si->si_slots[preferred].sl_valid) {
> > +		if (!si->si_slots[preferred].sl_valid ||
> > +		    !si->si_slots[preferred].sl_node_num) {
> 
> Why specially handle node num 0 here?
> It seems breaks original logic.
Since in local(or nocluster) mode, the code will not invoke any DLM/cluster related interfaces.
The node_num is set to 0 directly.
In the past, local mount/cluster mount will not happen on the same volume.
But, after nocluster option is introduced, local(nocluster) mount/cluster mount will possibly happen(not at the same time) on the same volume. 
If we mount the shared volume with the cluster mode after a local (nocluster mode) mount crash,
We have to use that slot(slot 0, which was used by the last local mount), otherwise, there is possibly not more slot available.
```

Gang's commented flow (based on the patched code):
1. Do a nocluster mount; it takes slot 0.
2. Crash, reboot.
3. Do a clustered mount.
4. Slot 0 now has sl_valid:1, sl_node_num:0
   ocfs2_find_slot
    + __ocfs2_node_num_to_slot()
    |  //fails for 'sl_valid == 1'
    |
    + __ocfs2_find_empty_slot()
       //should reuse slot 0, which the nocluster mount occupied,
       //so 'sl_node_num == 0' should be treated as an empty slot for reuse.
       //if this slot is not reused, other nodes may not have enough slots to mount.

Finally, I need some time to find a solution.

Thanks,
Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-07  3:06           ` Junxiao Bi via Ocfs2-devel
@ 2022-06-07  6:31             ` Joseph Qi via Ocfs2-devel
  2022-06-07  9:42               ` Heming Zhao via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-06-07  6:31 UTC (permalink / raw)
  To: Junxiao Bi, heming.zhao; +Cc: ocfs2-devel



On 6/7/22 11:06 AM, Junxiao Bi wrote:
>>>> Seems I am missing some mails for this thread.
>>>> The 'nocluster' mount is introduced by Gang and I think it has real
>>>> user scenarios. I am curious about since node 0 is commonly used in
>>>> o2cb, why there is no any bug report before.
>>>> So let's try to fix the regression first.
>>> Real user case doesn’t mean this has to been done through kernel? This sounds like doing something in kernel that is to workaround some issue that can be done from user space.
>>> I didn’t see a Reviewed-by for the patch, how did it get merged?
>> Gang had left SUSE for some time, and busy with his new job.
>> I have vague memory, he said this commit approved & merged directly by Andrew Morton.
>> Gang dedicated to contribute ocfs2 community many years, and set up his competence
>> to other maintainers & reviewers.
>>
>> If Junxiao dislike this feature, and don't want to fix it as a bug.
>> I am willing to file a patch.
> To fix, it’s not only the regression it causes, but also do something in mount.ocfs2 to check whether any node is mounting the volume before nocluster mount and also stop other nodes mounting before  nocluster mount is unmounted. That’s to simulate what fsck.ocfs2/mkfs.ocfs2 do. Without that guard, this feature just provides a new way for customer to corrupt their data. It’s just time some customer would do something bad and lost their data. 
> I will leave how to handle this to you and Joseph, we already reverted this patch.

I searched the mailing list and found the original thread for reference:
https://lore.kernel.org/ocfs2-devel/CH2PR18MB32064CCD80FE98F03B82A816CFAC0@CH2PR18MB3206.namprd18.prod.outlook.com/

I suggest we keep nocluster mount as a special mode and make sure its logic
doesn't impact the other modes, i.e. cluster or local mount.
I agree with Junxiao that we must try our best to prevent data corruption
even when a customer uses the feature by mistake.

Thanks,
Joseph


* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-07  2:38         ` heming.zhao--- via Ocfs2-devel
@ 2022-06-07  3:06           ` Junxiao Bi via Ocfs2-devel
  2022-06-07  6:31             ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Junxiao Bi via Ocfs2-devel @ 2022-06-07  3:06 UTC (permalink / raw)
  To: heming.zhao; +Cc: ocfs2-devel



> On Jun 6, 2022, at 7:38 PM, heming.zhao@suse.com wrote:
> 
> On 6/7/22 10:21, Junxiao Bi wrote:
>>>> On Jun 6, 2022, at 7:07 PM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>>> 
>>> 
>>> 
>>>> On 6/7/22 7:50 AM, heming.zhao@suse.com wrote:
>>>>> On 6/7/22 00:15, Junxiao Bi wrote:
>>>>>> On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
>>>>> 
>>>>>> Hello Junxiao,
>>>>>> 
>>>>>> First of all, let's turn to the same channel to discuss your patch.
>>>>>> There are two features: 'local mount' & 'nocluster mount'.
>>>>>> I mistakenly wrote local-mount on some place in previous mails.
>>>>>> This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
>>>>>> 
>>>>>> 
>>>>>> On 6/5/22 00:19, Junxiao Bi wrote:
>>>>>>> 
>>>>>>>> On Jun 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>>>>>> 
>>>>>>>> Hello Junxiao,
>>>>>>>> 
>>>>>>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>>>>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>>>>>>> This commit introduced a regression that can cause mount hung.
>>>>>>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>>>>>>> none-zero node number can grab the slot that was already taken by
>>>>>>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>>>>>>> try to grab journal cluster lock, it will hung because it was already
>>>>>>>>> acquired by node 0.
>>>>>>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>>>>>>> then node 1, you will see the following call trace from node 1.
>>>>>>>>   From your description, it looks your env mixed local-mount & clustered-mount.
>>>>>>> No, only cluster mount.
>>>>>>>> Could you mind to share your test/reproducible steps.
>>>>>>>> And which ha stack do you use, pmck or o2cb?
>>>>>>>> 
>>>>>>>> I failed to reproduce it, my test steps (with pcmk stack):
>>>>>>>> ```
>>>>>>>> node1:
>>>>>>>> mount -t ocfs2 /dev/vdd /mnt
>>>>>>>> 
>>>>>>>> node2:
>>>>>>>> for i in {1..100}; do
>>>>>>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>>>>>>> sleep 3;
>>>>>>>> echo "umount"; umount /mnt;
>>>>>>>> done
>>>>>>>> ```
>>>>>>>> 
>>>>>>> Try set one node with node number 0 and mount it there first. I used o2cb stack.
>>>>>> Could you show more test info/steps. I can't follow your meaning.
>>>>>> How to set up a node with a fix node number?
>>>>>> With my understanding, under pcmk env, the first mounted node will auto got node
>>>>>> number 1 (or any value great than 0). and there is no place to set node number
>>>>>> by hand. It's very likely you mixed to use nocluster & cluster mount.
>>>>>> If my suspect right (mixed mount), your use case is wrong.
>>>>> 
>>>>> Did you check my last mail? I already said i didn't do mixed mount, only cluster mount.
>>>> 
>>>> I carefully read every word of your mails. we are in different world. (pcmk vs o2cb)
>>>> In pcmk env, slot number always great than 0. (I also maintain cluster-md in suse,
>>>> in slot_number@drivers/md/md-cluster.c, you can see the number never ZERO).
>>>> 
>>>>> 
>>>>> There is a configure file for o2cb, you can just set node number to 0, please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
>>>> 
>>>> Thank you for sharing. I will read & learn it.
>>>> 
>>>>> 
>>>>>> 
>>>>>>>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>>>>>>>> I want to find whether there is a idear way to fix the hung issue.
>>>>>>>> 
>>>>>>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>>>>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>>>>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>>>>>>> [13148.749354] Call Trace:
>>>>>>>>> ...
>>>>>>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>>>>>>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>>>>>>>> that is a very dangerous, it can easily cause serious data corruption,
>>>>>>>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>>>>>>> I can't follow your meaning. When users want to use local mount feature, they MUST know
>>>>>>>> what they are doing, and how to use it.
>>>>>>> I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
>>>>>> yes, no one reads doc by default.
>>>>>> 
>>>>>> currently, mount with option 'nocluster' will show special info to user:
>>>>>> 
>>>>>> ```
>>>>>> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
>>>>>> Warning: to mount a clustered volume without the cluster stack.
>>>>>> Please make sure you only mount the file system from one node.
>>>>>> Otherwise, the file system may be damaged.
>>>>>> Proceed (y/N):
>>>>>> ```
>>>>>> 
>>>>>>> It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
>>>>>> This feature is not enabled by default, and also shows enough info/warn before executing.
>>>>>> I give (may awkward) another example:
>>>>>> nocluster mount likes executing command 'rm -rf /', do you think we should
>>>>>> tell/educate customer do not execute it?
>>>>> 
>>>>> That's totally out of domain of ocfs2, it's not ocfs2 developer's job to tell customer not doing that.
>>>>> 
>>>>> Here you provided a ocfs2 feature that can easily corrupt ocfs2.
>>>> 
>>>> First, this hung issue or any other related issues can be fixed. I have already
>>>> described a method in my previous mail. (use es_reserved1[0] of ocfs2_extended_slot)
>>>> 
>>>> Second, this feature have been merged two years, only you reported a hung issue.
>>>> Our customer also uses it for 2 years, no bug reported from them. it means,
>>>> at least, this feature fine works in pcmk stack.
>>>> 
>>>>> 
>>>>> As an ocfs2 developer, you should make sure ocfs2 was not corrupted even customer did something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether ocfs2 volume is mounted in the cluster before changing anything.
>>>> 
>>>> fsck.ocfs2 with '-F' could work in noclustered env.
>>>> In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
>>>> This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
>>>> which could make fsck.ocfs2 totally work in noclustered env.
>>>> (mkfs.ocfs2 can also work in local mount mode which is another story)
>>>> 
>>>>> 
>>>>>> 
>>>>>> The nocluster mount feature was designed to resolve customer pain point from real world:
>>>>>> SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which complicates/inconveniences
>>>>>> to set up. and need to install dozens of related packages.
>>>>>> 
>>>>>> The nocluster feature main use case:
>>>>>> customer wants to avoid to set up HA stack, but they wants to check ocfs2 volume
>>>>>> or do backup volume.
>>>>> That doesn't mean you have to do this in kernel. Customer had a pain to setup HA stack, you should develop some script/app to make it easy.
>>>> 
>>>> It's not a simple job to develop some script/app to help setup ha stack.
>>>> Both SUSE and Red Hat have special team to do this. In SUSE, this team have worked many years.
>>>> If any one can create an easy/powerful HA auto setup tools, He can even found a company
>>>> to sell this software.
>>>> 
>>>>>> 
>>>>>> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
>>>>>> If there are some problems related new feature, we should do our best to fix it not revert it.
>>>>> 
>>>>> Only good/safe features, i don't think this one is qualified. Also no one give a reviewed-by to this commit, i am not sure how it was merged.
>>>> 
>>>> I had shared my idea about how to fix this hung issue. it's not a big bug.
>>>> More useful feature could attract more users, it will make ocfs2 community more powerful.
>>>> 
>>>>> 
>>>>> Joseph, what's your call on this?
>>>> 
>>>> me too, wait for maintainer feedback.
>>>> 
>>> 
>>> Seems I am missing some mails for this thread.
>>> The 'nocluster' mount is introduced by Gang and I think it has real
>>> user scenarios. I am curious about since node 0 is commonly used in
>>> o2cb, why there is no any bug report before.
>>> So let's try to fix the regression first.
>> Real user case doesn’t mean this has to been done through kernel? This sounds like doing something in kernel that is to workaround some issue that can be done from user space.
>> I didn’t see a Reviewed-by for the patch, how did it get merged?
> 
> Gang had left SUSE for some time, and busy with his new job.
> I have vague memory, he said this commit approved & merged directly by Andrew Morton.
> Gang dedicated to contribute ocfs2 community many years, and set up his competence
> to other maintainers & reviewers.
> 
> If Junxiao dislike this feature, and don't want to fix it as a bug.
> I am willing to file a patch.
To fix this, it’s not only the regression: mount.ocfs2 must also check whether any node has the volume mounted before a nocluster mount, and stop other nodes from mounting until the nocluster mount is unmounted. That simulates what fsck.ocfs2/mkfs.ocfs2 do. Without that guard, this feature just gives customers a new way to corrupt their data. It’s only a matter of time before some customer does something bad and loses their data.
I will leave how to handle this to you and Joseph; we have already reverted this patch.

Thanks,
Junxiao
> 
> /Heming
> 

* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-07  2:21       ` Junxiao Bi via Ocfs2-devel
@ 2022-06-07  2:38         ` heming.zhao--- via Ocfs2-devel
  2022-06-07  3:06           ` Junxiao Bi via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-06-07  2:38 UTC (permalink / raw)
  To: Junxiao Bi, Joseph Qi; +Cc: ocfs2-devel

On 6/7/22 10:21, Junxiao Bi wrote:
> 
> 
> 
>> On Jun 6, 2022, at 7:07 PM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>>
>> 
>>
>>> On 6/7/22 7:50 AM, heming.zhao@suse.com wrote:
>>>> On 6/7/22 00:15, Junxiao Bi wrote:
>>>>> On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
>>>>
>>>>> Hello Junxiao,
>>>>>
>>>>> First of all, let's turn to the same channel to discuss your patch.
>>>>> There are two features: 'local mount' & 'nocluster mount'.
>>>>> I mistakenly wrote local-mount on some place in previous mails.
>>>>> This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
>>>>>
>>>>>
>>>>> On 6/5/22 00:19, Junxiao Bi wrote:
>>>>>>
>>>>>>> On Jun 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>>>>>
>>>>>>> Hello Junxiao,
>>>>>>>
>>>>>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>>>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>>>>>> This commit introduced a regression that can cause mount hung.
>>>>>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>>>>>> none-zero node number can grab the slot that was already taken by
>>>>>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>>>>>> try to grab journal cluster lock, it will hung because it was already
>>>>>>>> acquired by node 0.
>>>>>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>>>>>> then node 1, you will see the following call trace from node 1.
>>>>>>>    From your description, it looks your env mixed local-mount & clustered-mount.
>>>>>> No, only cluster mount.
>>>>>>> Could you mind to share your test/reproducible steps.
>>>>>>> And which ha stack do you use, pmck or o2cb?
>>>>>>>
>>>>>>> I failed to reproduce it, my test steps (with pcmk stack):
>>>>>>> ```
>>>>>>> node1:
>>>>>>> mount -t ocfs2 /dev/vdd /mnt
>>>>>>>
>>>>>>> node2:
>>>>>>> for i in {1..100}; do
>>>>>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>>>>>> sleep 3;
>>>>>>> echo "umount"; umount /mnt;
>>>>>>> done
>>>>>>> ```
>>>>>>>
>>>>>> Try set one node with node number 0 and mount it there first. I used o2cb stack.
>>>>> Could you show more test info/steps. I can't follow your meaning.
>>>>> How to set up a node with a fix node number?
>>>>> With my understanding, under pcmk env, the first mounted node will auto got node
>>>>> number 1 (or any value great than 0). and there is no place to set node number
>>>>> by hand. It's very likely you mixed to use nocluster & cluster mount.
>>>>> If my suspect right (mixed mount), your use case is wrong.
>>>>
>>>> Did you check my last mail? I already said i didn't do mixed mount, only cluster mount.
>>>
>>> I carefully read every word of your mails. we are in different world. (pcmk vs o2cb)
>>> In pcmk env, slot number always great than 0. (I also maintain cluster-md in suse,
>>> in slot_number@drivers/md/md-cluster.c, you can see the number never ZERO).
>>>
>>>>
>>>> There is a configure file for o2cb, you can just set node number to 0, please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
>>>
>>> Thank you for sharing. I will read & learn it.
>>>
>>>>
>>>>>
>>>>>>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>>>>>>> I want to find whether there is a idear way to fix the hung issue.
>>>>>>>
>>>>>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>>>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>>>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>>>>>> [13148.749354] Call Trace:
>>>>>>>> ...
>>>>>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>>>>>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>>>>>>> that is a very dangerous, it can easily cause serious data corruption,
>>>>>>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>>>>>> I can't follow your meaning. When users want to use local mount feature, they MUST know
>>>>>>> what they are doing, and how to use it.
>>>>>> I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
>>>>> yes, no one reads doc by default.
>>>>>
>>>>> currently, mount with option 'nocluster' will show special info to user:
>>>>>
>>>>> ```
>>>>> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
>>>>> Warning: to mount a clustered volume without the cluster stack.
>>>>> Please make sure you only mount the file system from one node.
>>>>> Otherwise, the file system may be damaged.
>>>>> Proceed (y/N):
>>>>> ```
>>>>>
>>>>>> It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
>>>>> This feature is not enabled by default, and also shows enough info/warn before executing.
>>>>> I give (may awkward) another example:
>>>>> nocluster mount likes executing command 'rm -rf /', do you think we should
>>>>> tell/educate customer do not execute it?
>>>>
>>>> That's totally out of domain of ocfs2, it's not ocfs2 developer's job to tell customer not doing that.
>>>>
>>>> Here you provided a ocfs2 feature that can easily corrupt ocfs2.
>>>
>>> First, this hung issue or any other related issues can be fixed. I have already
>>> described a method in my previous mail. (use es_reserved1[0] of ocfs2_extended_slot)
>>>
>>> Second, this feature have been merged two years, only you reported a hung issue.
>>> Our customer also uses it for 2 years, no bug reported from them. it means,
>>> at least, this feature fine works in pcmk stack.
>>>
>>>>
>>>> As an ocfs2 developer, you should make sure ocfs2 was not corrupted even customer did something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether ocfs2 volume is mounted in the cluster before changing anything.
>>>
>>> fsck.ocfs2 with '-F' could work in noclustered env.
>>> In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
>>> This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
>>> which could make fsck.ocfs2 totally work in noclustered env.
>>> (mkfs.ocfs2 can also work in local mount mode which is another story)
>>>
>>>>
>>>>>
>>>>> The nocluster mount feature was designed to resolve customer pain point from real world:
>>>>> SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which complicates/inconveniences
>>>>> to set up. and need to install dozens of related packages.
>>>>>
>>>>> The nocluster feature main use case:
>>>>> customer wants to avoid to set up HA stack, but they wants to check ocfs2 volume
>>>>> or do backup volume.
>>>> That doesn't mean you have to do this in kernel. Customer had a pain to setup HA stack, you should develop some script/app to make it easy.
>>>
>>> It's not a simple job to develop some script/app to help setup ha stack.
>>> Both SUSE and Red Hat have special team to do this. In SUSE, this team have worked many years.
>>> If any one can create an easy/powerful HA auto setup tools, He can even found a company
>>> to sell this software.
>>>
>>>>>
>>>>> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
>>>>> If there are some problems related new feature, we should do our best to fix it not revert it.
>>>>
>>>> Only good/safe features, i don't think this one is qualified. Also no one give a reviewed-by to this commit, i am not sure how it was merged.
>>>
>>> I had shared my idea about how to fix this hung issue. it's not a big bug.
>>> More useful feature could attract more users, it will make ocfs2 community more powerful.
>>>
>>>>
>>>> Joseph, what's your call on this?
>>>
>>> me too, wait for maintainer feedback.
>>>
>>
>> Seems I am missing some mails for this thread.
>> The 'nocluster' mount is introduced by Gang and I think it has real
>> user scenarios. I am curious about since node 0 is commonly used in
>> o2cb, why there is no any bug report before.
>> So let's try to fix the regression first.
> Real user case doesn’t mean this has to been done through kernel? This sounds like doing something in kernel that is to workaround some issue that can be done from user space.
> I didn’t see a Reviewed-by for the patch, how did it get merged?
> 

Gang left SUSE some time ago and is busy with his new job.
If I remember correctly, he said this commit was approved & merged directly by Andrew Morton.
Gang spent many years contributing to the ocfs2 community and proved his competence
to the other maintainers & reviewers.

If Junxiao dislikes this feature and doesn't want to fix this as a bug,
I am willing to file a patch.

/Heming



* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-07  2:07     ` Joseph Qi via Ocfs2-devel
@ 2022-06-07  2:21       ` Junxiao Bi via Ocfs2-devel
  2022-06-07  2:38         ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Junxiao Bi via Ocfs2-devel @ 2022-06-07  2:21 UTC (permalink / raw)
  To: Joseph Qi; +Cc: ocfs2-devel




> On Jun 6, 2022, at 7:07 PM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
> 
> 
> 
>> On 6/7/22 7:50 AM, heming.zhao@suse.com wrote:
>>> On 6/7/22 00:15, Junxiao Bi wrote:
>>>> On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
>>> 
>>>> Hello Junxiao,
>>>> 
>>>> First of all, let's turn to the same channel to discuss your patch.
>>>> There are two features: 'local mount' & 'nocluster mount'.
>>>> I mistakenly wrote local-mount on some place in previous mails.
>>>> This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
>>>> 
>>>> 
>>>> On 6/5/22 00:19, Junxiao Bi wrote:
>>>>> 
>>>>>> On Jun 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>>>> 
>>>>>> Hello Junxiao,
>>>>>> 
>>>>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>>>>> This commit introduced a regression that can cause mount hung.
>>>>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>>>>> none-zero node number can grab the slot that was already taken by
>>>>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>>>>> try to grab journal cluster lock, it will hung because it was already
>>>>>>> acquired by node 0.
>>>>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>>>>> then node 1, you will see the following call trace from node 1.
>>>>>>   From your description, it looks your env mixed local-mount & clustered-mount.
>>>>> No, only cluster mount.
>>>>>> Could you mind to share your test/reproducible steps.
>>>>>> And which ha stack do you use, pmck or o2cb?
>>>>>> 
>>>>>> I failed to reproduce it, my test steps (with pcmk stack):
>>>>>> ```
>>>>>> node1:
>>>>>> mount -t ocfs2 /dev/vdd /mnt
>>>>>> 
>>>>>> node2:
>>>>>> for i in {1..100}; do
>>>>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>>>>> sleep 3;
>>>>>> echo "umount"; umount /mnt;
>>>>>> done
>>>>>> ```
>>>>>> 
>>>>> Try set one node with node number 0 and mount it there first. I used o2cb stack.
>>>> Could you show more test info/steps. I can't follow your meaning.
>>>> How to set up a node with a fix node number?
>>>> With my understanding, under pcmk env, the first mounted node will auto got node
>>>> number 1 (or any value great than 0). and there is no place to set node number
>>>> by hand. It's very likely you mixed to use nocluster & cluster mount.
>>>> If my suspect right (mixed mount), your use case is wrong.
>>> 
>>> Did you check my last mail? I already said i didn't do mixed mount, only cluster mount.
>> 
>> I carefully read every word of your mails. we are in different world. (pcmk vs o2cb)
>> In pcmk env, slot number always great than 0. (I also maintain cluster-md in suse,
>> in slot_number@drivers/md/md-cluster.c, you can see the number never ZERO).
>> 
>>> 
>>> There is a configure file for o2cb, you can just set node number to 0, please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
>> 
>> Thank you for sharing. I will read & learn it.
>> 
>>> 
>>>> 
>>>>>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>>>>>> I want to find whether there is a idear way to fix the hung issue.
>>>>>> 
>>>>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>>>>> [13148.749354] Call Trace:
>>>>>>> ...
>>>>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>>>>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>>>>>> that is a very dangerous, it can easily cause serious data corruption,
>>>>>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>>>>> I can't follow your meaning. When users want to use local mount feature, they MUST know
>>>>>> what they are doing, and how to use it.
>>>>> I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
>>>> yes, no one reads doc by default.
>>>> 
>>>> currently, mount with option 'nocluster' will show special info to user:
>>>> 
>>>> ```
>>>> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
>>>> Warning: to mount a clustered volume without the cluster stack.
>>>> Please make sure you only mount the file system from one node.
>>>> Otherwise, the file system may be damaged.
>>>> Proceed (y/N):
>>>> ```
>>>> 
>>>>> It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
>>>> This feature is not enabled by default, and also shows enough info/warn before executing.
>>>> I give (may awkward) another example:
>>>> nocluster mount likes executing command 'rm -rf /', do you think we should
>>>> tell/educate customer do not execute it?
>>> 
>>> That's totally out of domain of ocfs2, it's not ocfs2 developer's job to tell customer not doing that.
>>> 
>>> Here you provided a ocfs2 feature that can easily corrupt ocfs2.
>> 
>> First, this hung issue or any other related issues can be fixed. I have already
>> described a method in my previous mail. (use es_reserved1[0] of ocfs2_extended_slot)
>> 
>> Second, this feature have been merged two years, only you reported a hung issue.
>> Our customer also uses it for 2 years, no bug reported from them. it means,
>> at least, this feature fine works in pcmk stack.
>> 
>>> 
>>> As an ocfs2 developer, you should make sure ocfs2 was not corrupted even customer did something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether ocfs2 volume is mounted in the cluster before changing anything.
>> 
>> fsck.ocfs2 with '-F' could work in noclustered env.
>> In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
>> This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
>> which could make fsck.ocfs2 totally work in noclustered env.
>> (mkfs.ocfs2 can also work in local mount mode which is another story)
>> 
>>> 
>>>> 
>>>> The nocluster mount feature was designed to resolve customer pain point from real world:
>>>> SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which complicates/inconveniences
>>>> to set up. and need to install dozens of related packages.
>>>> 
>>>> The nocluster feature main use case:
>>>> customer wants to avoid to set up HA stack, but they wants to check ocfs2 volume
>>>> or do backup volume.
>>> That doesn't mean you have to do this in kernel. Customer had a pain to setup HA stack, you should develop some script/app to make it easy.
>> 
>> It's not a simple job to develop some script/app to help setup ha stack.
>> Both SUSE and Red Hat have special team to do this. In SUSE, this team have worked many years.
>> If any one can create an easy/powerful HA auto setup tools, He can even found a company
>> to sell this software.
>> 
>>>> 
>>>> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
>>>> If there are problems with a new feature, we should do our best to fix it, not revert it.
>>> 
>>> Only good/safe features; I don't think this one qualifies. Also, no one gave a Reviewed-by to this commit; I am not sure how it was merged.
>> 
>> I have already shared my idea about how to fix this hang issue; it's not a big bug.
>> More useful features can attract more users, which will make the ocfs2 community stronger.
>> 
>>> 
>>> Joseph, what's your call on this?
>> 
>> Me too; I'll wait for the maintainer's feedback.
>> 
> 
> It seems I am missing some mails from this thread.
> The 'nocluster' mount was introduced by Gang, and I think it has real
> user scenarios. I am curious why, since node 0 is commonly used in
> o2cb, there has been no bug report before.
> So let's try to fix the regression first.
A real user case doesn't mean this has to be done in the kernel. This sounds like doing something in the kernel to work around an issue that could be solved from user space.
I didn't see a Reviewed-by for the patch; how did it get merged?

Thanks,
Junxiao
> 
> Thanks,
> Joseph
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-06 23:50   ` heming.zhao--- via Ocfs2-devel
  2022-06-07  0:32     ` Heming Zhao via Ocfs2-devel
@ 2022-06-07  2:07     ` Joseph Qi via Ocfs2-devel
  2022-06-07  2:21       ` Junxiao Bi via Ocfs2-devel
  1 sibling, 1 reply; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-06-07  2:07 UTC (permalink / raw)
  To: heming.zhao, Junxiao Bi; +Cc: ocfs2-devel



On 6/7/22 7:50 AM, heming.zhao@suse.com wrote:
> On 6/7/22 00:15, Junxiao Bi wrote:
>> On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
>>
>>> Hello Junxiao,
>>>
>>> First of all, let's get on the same page to discuss your patch.
>>> There are two features: 'local mount' & 'nocluster mount'.
>>> I mistakenly wrote local-mount in some places in previous mails.
>>> This patch reverts commit 912f655d78c5d4, which is related to 'nocluster mount'.
>>>
>>>
>>> On 6/5/22 00:19, Junxiao Bi wrote:
>>>>
>>>>> On June 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>>>
>>>>> Hello Junxiao,
>>>>>
>>>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>>>> This commit introduced a regression that can cause mount hung.
>>>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>>>> none-zero node number can grab the slot that was already taken by
>>>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>>>> try to grab journal cluster lock, it will hung because it was already
>>>>>> acquired by node 0.
>>>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>>>> then node 1, you will see the following call trace from node 1.
>>>>>   From your description, it looks like your env mixed local mount & clustered mount.
>>>> No, only cluster mount.
>>>>> Would you mind sharing your test/reproduction steps?
>>>>> And which HA stack do you use, pcmk or o2cb?
>>>>>
>>>>> I failed to reproduce it, my test steps (with pcmk stack):
>>>>> ```
>>>>> node1:
>>>>> mount -t ocfs2 /dev/vdd /mnt
>>>>>
>>>>> node2:
>>>>> for i in {1..100}; do
>>>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>>>> sleep 3;
>>>>> echo "umount"; umount /mnt;
>>>>> done
>>>>> ```
>>>>>
>>>> Try setting one node with node number 0 and mounting it there first. I used the o2cb stack.
>>> Could you show more test info/steps? I can't follow your meaning.
>>> How do you set up a node with a fixed node number?
>>> In my understanding, under a pcmk env, the first mounted node automatically gets node
>>> number 1 (or any value greater than 0), and there is no place to set the node number
>>> by hand. It's very likely you mixed nocluster & cluster mounts.
>>> If my suspicion is right (mixed mount), your use case is wrong.
>>
>> Did you check my last mail? I already said I didn't do a mixed mount, only a cluster mount.
> 
> I carefully read every word of your mails. We are in different worlds (pcmk vs o2cb).
> In a pcmk env, the slot number is always greater than 0. (I also maintain cluster-md in SUSE;
> in slot_number@drivers/md/md-cluster.c, you can see the number is never ZERO.)
> 
>>
>> There is a configuration file for o2cb where you can just set the node number to 0; please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
> 
> Thank you for sharing. I will read & learn it.
> 
>>
>>>
>>>>> This local mount feature helps SUSE customers maintain ocfs2 partitions; it's useful.
>>>>> I want to find whether there is an ideal way to fix the hang issue.
>>>>>
>>>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>>>> [13148.749354] Call Trace:
>>>>>> ...
>>>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But the original commit
>>>>>> introduced the feature to mount ocfs2 locally even though it is cluster-based,
>>>>>> which is very dangerous: it can easily cause serious data corruption, and
>>>>>> there is no way to stop other nodes from mounting the fs and corrupting it.
>>>>> I can't follow your meaning. When users want to use the local mount feature, they MUST know
>>>>> what they are doing, and how to use it.
>>>> I can't agree with you. There is no mechanism to make sure customers will follow that; you can't expect customers to understand the tech well or even read the doc.
>>> Yes, no one reads the doc by default.
>>>
>>> currently, mount with option 'nocluster' will show special info to user:
>>>
>>> ```
>>> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
>>> Warning: to mount a clustered volume without the cluster stack.
>>> Please make sure you only mount the file system from one node.
>>> Otherwise, the file system may be damaged.
>>> Proceed (y/N):
>>> ```
>>>
>>>> It's not the case that you don't have a choice; setting up the cluster stack is the way to stop customers from doing something bad. I believe you have to educate customers to understand that this is the cost of guarding data security; otherwise, when something bad happens, they will lose important data, maybe with no way to recover.
>>> This feature is not enabled by default, and it also shows enough info/warning before executing.
>>> Let me give another (maybe awkward) example:
>>> a nocluster mount is like executing the command 'rm -rf /'; do you think we should
>>> tell/educate customers not to execute it?
>>
>> That's totally outside the domain of ocfs2; it's not an ocfs2 developer's job to tell customers not to do that.
>> 
>> Here you provided an ocfs2 feature that can easily corrupt ocfs2.
> 
> First, this hang issue, and any other related issues, can be fixed. I have already
> described a method in my previous mail (use es_reserved1[0] of ocfs2_extended_slot).
> 
> Second, this feature has been merged for two years, and only you have reported a hang issue.
> Our customers have also used it for two years, with no bugs reported from them. That means,
> at least, this feature works fine with the pcmk stack.
> 
>>
>> As an ocfs2 developer, you should make sure ocfs2 is not corrupted even if the customer does something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether an ocfs2 volume is mounted in the cluster before changing anything.
> 
> fsck.ocfs2 with '-F' can work in a non-clustered env.
> In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
> This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
> making fsck.ocfs2 fully usable in a non-clustered env.
> (mkfs.ocfs2 can also work in local mount mode, but that is another story.)
> 
>>
>>>
>>> The nocluster mount feature was designed to resolve a real-world customer pain point:
>>> the SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which is complicated and
>>> inconvenient to set up, and requires installing dozens of related packages.
>>> 
>>> The nocluster feature's main use case:
>>> customers want to avoid setting up the HA stack, but they want to check an ocfs2
>>> volume or back it up.
>> That doesn't mean you have to do this in the kernel. If customers find it painful to set up the HA stack, you should develop a script/app to make it easy.
> 
> It's not a simple job to develop a script/app to help set up the HA stack.
> Both SUSE and Red Hat have dedicated teams for this. At SUSE, this team has worked for many years.
> If anyone could create an easy, powerful HA auto-setup tool, they could even found a company
> to sell that software.
> 
>>>
>>> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
>>> If there are problems with a new feature, we should do our best to fix it, not revert it.
>> 
>> Only good/safe features; I don't think this one qualifies. Also, no one gave a Reviewed-by to this commit; I am not sure how it was merged.
> 
> I have already shared my idea about how to fix this hang issue; it's not a big bug.
> More useful features can attract more users, which will make the ocfs2 community stronger.
> 
>>
>> Joseph, what's your call on this?
> 
> Me too; I'll wait for the maintainer's feedback.
> 

It seems I am missing some mails from this thread.
The 'nocluster' mount was introduced by Gang, and I think it has real
user scenarios. I am curious why, since node 0 is commonly used in
o2cb, there has been no bug report before.
So let's try to fix the regression first.

Thanks,
Joseph


* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-06 23:50   ` heming.zhao--- via Ocfs2-devel
@ 2022-06-07  0:32     ` Heming Zhao via Ocfs2-devel
  2022-06-07  2:07     ` Joseph Qi via Ocfs2-devel
  1 sibling, 0 replies; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-06-07  0:32 UTC (permalink / raw)
  To: Junxiao Bi, Joseph Qi; +Cc: ocfs2-devel

On Tue, Jun 07, 2022 at 07:50:21AM +0800, heming.zhao@suse.com wrote:
> On 6/7/22 00:15, Junxiao Bi wrote:
> > On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
> > 
> > > Hello Junxiao,
> > > 
> > > First of all, let's turn to the same channel to discuss your patch.
> > > There are two features: 'local mount' & 'nocluster mount'.
> > > I mistakenly wrote local-mount on some place in previous mails.
> > > This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
> > > 
> > > 
> > > On 6/5/22 00:19, Junxiao Bi wrote:
> > > > 
> > > > > On June 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
> > > > > 
> > > > > Hello Junxiao,
> > > > > 
> > > > > > On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
> > > > > > This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
> > > > > > This commit introduced a regression that can cause mount hung.
> > > > > > The changes in __ocfs2_find_empty_slot causes that any node with
> > > > > > none-zero node number can grab the slot that was already taken by
> > > > > > node 0, so node 1 will access the same journal with node 0, when it
> > > > > > try to grab journal cluster lock, it will hung because it was already
> > > > > > acquired by node 0.
> > > > > > It's very easy to reproduce this, in one cluster, mount node 0 first,
> > > > > > then node 1, you will see the following call trace from node 1.
> > > > >   From your description, it looks your env mixed local-mount & clustered-mount.
> > > > No, only cluster mount.
> > > > > Could you mind to share your test/reproducible steps.
> > > > > And which ha stack do you use, pmck or o2cb?
> > > > > 
> > > > > I failed to reproduce it, my test steps (with pcmk stack):
> > > > > ```
> > > > > node1:
> > > > > mount -t ocfs2 /dev/vdd /mnt
> > > > > 
> > > > > node2:
> > > > > for i in {1..100}; do
> > > > > echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
> > > > > sleep 3;
> > > > > echo "umount"; umount /mnt;
> > > > > done
> > > > > ```
> > > > > 
> > > > Try set one node with node number 0 and mount it there first. I used o2cb stack.
> > > Could you show more test info/steps. I can't follow your meaning.
> > > How to set up a node with a fix node number?
> > > With my understanding, under pcmk env, the first mounted node will auto got node
> > > number 1 (or any value great than 0). and there is no place to set node number
> > > by hand. It's very likely you mixed to use nocluster & cluster mount.
> > > If my suspect right (mixed mount), your use case is wrong.
> > 
> > Did you check my last mail? I already said i didn't do mixed mount, only cluster mount.
> 
> I carefully read every word of your mails. we are in different world. (pcmk vs o2cb)
> In pcmk env, slot number always great than 0. (I also maintain cluster-md in suse,
> in slot_number@drivers/md/md-cluster.c, you can see the number never ZERO).
> 
> > 
> > There is a configure file for o2cb, you can just set node number to 0, please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
> 
> Thank you for sharing. I will read & learn it.
> 
> > 
> > > 
> > > > > This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
> > > > > I want to find whether there is a idear way to fix the hung issue.
> > > > > 
> > > > > > [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
> > > > > > [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
> > > > > > [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > > > > > [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
> > > > > > [13148.749354] Call Trace:
> > > > > > ...
> > > > > > To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
> > > > > > introduced the feature to mount ocfs2 locally even it is cluster based,
> > > > > > that is a very dangerous, it can easily cause serious data corruption,
> > > > > > there is no way to stop other nodes mounting the fs and corrupting it.
> > > > > I can't follow your meaning. When users want to use local mount feature, they MUST know
> > > > > what they are doing, and how to use it.
> > > > I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
> > > yes, no one reads doc by default.
> > > 
> > > currently, mount with option 'nocluster' will show special info to user:
> > > 
> > > ```
> > > # mount -t ocfs2 -o nocluster /dev/vdd /mnt
> > > Warning: to mount a clustered volume without the cluster stack.
> > > Please make sure you only mount the file system from one node.
> > > Otherwise, the file system may be damaged.
> > > Proceed (y/N):
> > > ```
> > > 
> > > > It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
> > > This feature is not enabled by default, and also shows enough info/warn before executing.
> > > I give (may awkward) another example:
> > > nocluster mount likes executing command 'rm -rf /', do you think we should
> > > tell/educate customer do not execute it?
> > 
> > That's totally out of domain of ocfs2, it's not ocfs2 developer's job to tell customer not doing that.
> > 
> > Here you provided a ocfs2 feature that can easily corrupt ocfs2.
> 
> First, this hung issue or any other related issues can be fixed. I have already
> described a method in my previous mail. (use es_reserved1[0] of ocfs2_extended_slot)
> 
> Second, this feature have been merged two years, only you reported a hung issue.
> Our customer also uses it for 2 years, no bug reported from them. it means,
> at least, this feature fine works in pcmk stack.
> 
> > 
> > As an ocfs2 developer, you should make sure ocfs2 was not corrupted even customer did something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether ocfs2 volume is mounted in the cluster before changing anything.
> 
> fsck.ocfs2 with '-F' could work in noclustered env.
> In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
> This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
> which could make fsck.ocfs2 totally work in noclustered env.
> (mkfs.ocfs2 can also work in local mount mode which is another story)
> 
> > 
> > > 
> > > The nocluster mount feature was designed to resolve customer pain point from real world:
> > > SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which complicates/inconveniences
> > > to set up. and need to install dozens of related packages.
> > > 
> > > The nocluster feature main use case:
> > > customer wants to avoid to set up HA stack, but they wants to check ocfs2 volume
> > > or do backup volume.
> > That doesn't mean you have to do this in kernel. Customer had a pain to setup HA stack, you should develop some script/app to make it easy.
> 
> It's not a simple job to develop some script/app to help setup ha stack.
> Both SUSE and Red Hat have special team to do this. In SUSE, this team have worked many years.
> If any one can create an easy/powerful HA auto setup tools, He can even found a company
> to sell this software.
> 
> > > 
> > > In my opinion, we should make ocfs2 more powerful and include more useful features for users.
> > > If there are some problems related new feature, we should do our best to fix it not revert it.
> > 
> > Only good/safe features, i don't think this one is qualified. Also no one give a reviewed-by to this commit, i am not sure how it was merged.
> 
> I had shared my idea about how to fix this hung issue. it's not a big bug.
> More useful feature could attract more users, it will make ocfs2 community more powerful.
> 
> > 
> > Joseph, what's your call on this?
> 
> me too, wait for maintainer feedback.
> 

I'll give another example to support keeping the nocluster feature:
gfs2, a competing filesystem to ocfs2, also supports running in nolock mode.

below is from fs/gfs2/Kconfig:

```
config GFS2_FS
    tristate "GFS2 file system support"
    select FS_POSIX_ACL
    select CRC32
    select LIBCRC32C
    select QUOTACTL
    select FS_IOMAP
    help
      A cluster filesystem.

      ... ...

      The "nolock" lock module is now built in to GFS2 by default. If
      you want to use the DLM, be sure to enable IPv4/6 networking.
```

/Heming



* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-06 16:15 ` Junxiao Bi via Ocfs2-devel
@ 2022-06-06 23:50   ` heming.zhao--- via Ocfs2-devel
  2022-06-07  0:32     ` Heming Zhao via Ocfs2-devel
  2022-06-07  2:07     ` Joseph Qi via Ocfs2-devel
  0 siblings, 2 replies; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-06-06 23:50 UTC (permalink / raw)
  To: Junxiao Bi, Joseph Qi; +Cc: ocfs2-devel

On 6/7/22 00:15, Junxiao Bi wrote:
> On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:
> 
>> Hello Junxiao,
>>
>> First of all, let's turn to the same channel to discuss your patch.
>> There are two features: 'local mount' & 'nocluster mount'.
>> I mistakenly wrote local-mount on some place in previous mails.
>> This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
>>
>>
>> On 6/5/22 00:19, Junxiao Bi wrote:
>>>
>>>> On June 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>>
>>>> Hello Junxiao,
>>>>
>>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>>> This commit introduced a regression that can cause mount hung.
>>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>>> none-zero node number can grab the slot that was already taken by
>>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>>> try to grab journal cluster lock, it will hung because it was already
>>>>> acquired by node 0.
>>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>>> then node 1, you will see the following call trace from node 1.
>>>>   From your description, it looks your env mixed local-mount & clustered-mount.
>>> No, only cluster mount.
>>>> Could you mind to share your test/reproducible steps.
>>>> And which ha stack do you use, pmck or o2cb?
>>>>
>>>> I failed to reproduce it, my test steps (with pcmk stack):
>>>> ```
>>>> node1:
>>>> mount -t ocfs2 /dev/vdd /mnt
>>>>
>>>> node2:
>>>> for i in {1..100}; do
>>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>>> sleep 3;
>>>> echo "umount"; umount /mnt;
>>>> done
>>>> ```
>>>>
>>> Try set one node with node number 0 and mount it there first. I used o2cb stack.
>> Could you show more test info/steps. I can't follow your meaning.
>> How to set up a node with a fix node number?
>> With my understanding, under pcmk env, the first mounted node will auto got node
>> number 1 (or any value great than 0). and there is no place to set node number
>> by hand. It's very likely you mixed to use nocluster & cluster mount.
>> If my suspect right (mixed mount), your use case is wrong.
> 
> Did you check my last mail? I already said i didn't do mixed mount, only cluster mount.

I carefully read every word of your mails. We are in different worlds (pcmk vs o2cb).
In a pcmk env, the slot number is always greater than 0. (I also maintain cluster-md in SUSE;
in slot_number@drivers/md/md-cluster.c, you can see the number is never ZERO.)

> 
> There is a configure file for o2cb, you can just set node number to 0, please check https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2

Thank you for sharing. I will read & learn it.

> 
>>
>>>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>>>> I want to find whether there is a idear way to fix the hung issue.
>>>>
>>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>>> [13148.749354] Call Trace:
>>>>> ...
>>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>>>> that is a very dangerous, it can easily cause serious data corruption,
>>>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>>> I can't follow your meaning. When users want to use local mount feature, they MUST know
>>>> what they are doing, and how to use it.
>>> I can’t agree with you. There is no  mechanism to make sure customer will follow that, you can’t expect customer understand tech well or even read the doc.
>> yes, no one reads doc by default.
>>
>> currently, mount with option 'nocluster' will show special info to user:
>>
>> ```
>> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
>> Warning: to mount a clustered volume without the cluster stack.
>> Please make sure you only mount the file system from one node.
>> Otherwise, the file system may be damaged.
>> Proceed (y/N):
>> ```
>>
>>> It’s not the case that you don’t have choice, setup cluster stack is the way to stop customer doing something bad, I believe you have to educate customer to understand this is the cost to guard data security, otherwise when something bad happens, they will lose important data, maybe even no way to recover.
>> This feature is not enabled by default, and also shows enough info/warn before executing.
>> I give (may awkward) another example:
>> nocluster mount likes executing command 'rm -rf /', do you think we should
>> tell/educate customer do not execute it?
> 
> That's totally out of domain of ocfs2, it's not ocfs2 developer's job to tell customer not doing that.
> 
> Here you provided a ocfs2 feature that can easily corrupt ocfs2.

First, this hang issue, and any other related issues, can be fixed. I have already
described a method in my previous mail (use es_reserved1[0] of ocfs2_extended_slot).

Second, this feature has been merged for two years, and only you have reported a hang issue.
Our customers have also used it for two years, with no bugs reported from them. That means,
at least, this feature works fine with the pcmk stack.

> 
> As an ocfs2 developer, you should make sure ocfs2 was not corrupted even customer did something bad. That's why mkfs.ocfs2/fsck.ocfs2 check whether ocfs2 volume is mounted in the cluster before changing anything.

fsck.ocfs2 with '-F' can work in a non-clustered env.
In 2005, commit 44c97d6ce8baeb4a6c37712d4c22d0702ebf7714 introduced this feature.
This year, commit 7085e9177adc7197250d872c50a05dfc9c531bdc enhanced it,
making fsck.ocfs2 fully usable in a non-clustered env.
(mkfs.ocfs2 can also work in local mount mode, but that is another story.)

> 
>>
>> The nocluster mount feature was designed to resolve customer pain point from real world:
>> SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which complicates/inconveniences
>> to set up. and need to install dozens of related packages.
>>
>> The nocluster feature main use case:
>> customer wants to avoid to set up HA stack, but they wants to check ocfs2 volume
>> or do backup volume.
> That doesn't mean you have to do this in kernel. Customer had a pain to setup HA stack, you should develop some script/app to make it easy.

It's not a simple job to develop a script/app to help set up the HA stack.
Both SUSE and Red Hat have dedicated teams for this. At SUSE, this team has worked for many years.
If anyone could create an easy, powerful HA auto-setup tool, they could even found a company
to sell that software.

>>
>> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
>> If there are some problems related new feature, we should do our best to fix it not revert it.
> 
> Only good/safe features, i don't think this one is qualified. Also no one give a reviewed-by to this commit, i am not sure how it was merged.

I have already shared my idea about how to fix this hang issue; it's not a big bug.
More useful features can attract more users, which will make the ocfs2 community stronger.

> 
> Joseph, what's your call on this?

Me too; I'll wait for the maintainer's feedback.

/Heming
> 
> Thanks,
> 
> Junxiao.
> 
>>>>   From mount.ocfs2(8): it also says to *only* mount the fs on *one* node at the same time,
>>>> and it tells the user the fs will be damaged by wrong usage.
>>>>
>>>> ```
>>>> nocluster
>>>>
>>>>    This  option  allows  users  to  mount a clustered volume without configuring the cluster
>>>>    stack.  However, you must be aware that you can only mount the file system from one  node
>>>>    at the same time, otherwise, the file system may be damaged. Please use it with caution.
>>>> ```
>>>>
>>>>> Setting up HA or another cluster-aware stack is just the cost that we have to
>>>>> pay to avoid corruption; otherwise we have to do it in the kernel.
>>>> It's a bit drastic to totally revert this commit just over a missing sanity
>>>> check. If you or the maintainer think the local mount should do more to prevent
>>>> mixed local-mount and clustered-mount scenarios, we could add more sanity checks
>>>> during local mounting.
>>> I don't think this should be done in the kernel. Setting up the cluster stack is the way forward.
>>>
>> My mistake: all of the above 'local mount' should be 'nocluster mount'.
>> 
>> Finally, let's fully understand your use case (or reproduce your hang issue).
>>
>> Thanks,
>> Heming
>>
> 



* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-06  2:08 heming.zhao--- via Ocfs2-devel
  2022-06-06  8:27 ` Heming Zhao via Ocfs2-devel
@ 2022-06-06 16:15 ` Junxiao Bi via Ocfs2-devel
  2022-06-06 23:50   ` heming.zhao--- via Ocfs2-devel
  1 sibling, 1 reply; 15+ messages in thread
From: Junxiao Bi via Ocfs2-devel @ 2022-06-06 16:15 UTC (permalink / raw)
  To: heming.zhao, Joseph Qi; +Cc: ocfs2-devel

On 6/5/22 7:08 PM, heming.zhao@suse.com wrote:

> Hello Junxiao,
>
> First of all, let's turn to the same channel to discuss your patch.
> There are two features: 'local mount' & 'nocluster mount'.
> I mistakenly wrote local-mount on some place in previous mails.
> This patch revert commit 912f655d78c5d4, which is related with 'nocluster mount'.
>
>
> On 6/5/22 00:19, Junxiao Bi wrote:
>>
>>> On June 4, 2022, at 1:45 AM, heming.zhao@suse.com wrote:
>>>
>>> Hello Junxiao,
>>>
>>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>>> This commit introduced a regression that can cause mount hung.
>>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>>> none-zero node number can grab the slot that was already taken by
>>>> node 0, so node 1 will access the same journal with node 0, when it
>>>> try to grab journal cluster lock, it will hung because it was already
>>>> acquired by node 0.
>>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>>> then node 1, you will see the following call trace from node 1.
>>>   From your description, it looks your env mixed local-mount & clustered-mount.
>> No, only cluster mount.
>>> Could you mind to share your test/reproducible steps.
>>> And which ha stack do you use, pmck or o2cb?
>>>
>>> I failed to reproduce it, my test steps (with pcmk stack):
>>> ```
>>> node1:
>>> mount -t ocfs2 /dev/vdd /mnt
>>>
>>> node2:
>>> for i in {1..100}; do
>>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>>> sleep 3;
>>> echo "umount"; umount /mnt;
>>> done
>>> ```
>>>
>> Try set one node with node number 0 and mount it there first. I used o2cb stack.
> Could you show more test info/steps. I can't follow your meaning.
> How to set up a node with a fix node number?
> With my understanding, under pcmk env, the first mounted node will auto got node
> number 1 (or any value great than 0). and there is no place to set node number
> by hand. It's very likely you mixed to use nocluster & cluster mount.
> If my suspect right (mixed mount), your use case is wrong.

Did you check my last mail? I already said I didn't do a mixed mount, only
a cluster mount.

There is a configuration file for o2cb where you can just set the node number to 0;
please check
https://docs.oracle.com/en/operating-systems/oracle-linux/7/fsadmin/ol7-ocfs2.html#ol7-config-file-ocfs2
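For reference, a minimal o2cb /etc/ocfs2/cluster.conf along those lines might look like the sketch below. The cluster name, node names, and IP addresses are made-up placeholders; see the Oracle documentation linked above for the authoritative format:

```
cluster:
        node_count = 2
        name = mycluster

node:
        ip_port = 7777
        ip_address = 192.0.2.10
        number = 0
        name = node0
        cluster = mycluster

node:
        ip_port = 7777
        ip_address = 192.0.2.11
        number = 1
        name = node1
        cluster = mycluster
```

With number = 0 assigned to the first node, mounting there first and then on node 1 matches the setup described in the report above.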

>
>>> This local mount feature helps SUSE customers to maintain ocfs2 partition, it's useful.
>>> I want to find whether there is a idear way to fix the hung issue.
>>>
>>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>>> [13148.749354] Call Trace:
>>>> ...
>>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>>> that is a very dangerous, it can easily cause serious data corruption,
>>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>> I can't follow your meaning. When users want to use the local mount feature, they MUST know
>>> what they are doing and how to use it.
>> I can't agree with you. There is no mechanism to make sure customers will follow that; you can't expect customers to understand the tech well or even read the doc.
> Yes, no one reads docs by default.
>
> Currently, mounting with the option 'nocluster' shows a special warning to the user:
>
> ```
> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
> Warning: to mount a clustered volume without the cluster stack.
> Please make sure you only mount the file system from one node.
> Otherwise, the file system may be damaged.
> Proceed (y/N):
> ```
>
>> It's not the case that you don't have a choice. Setting up the cluster stack is the way to stop customers from doing something bad. I believe you have to educate customers to understand that this is the cost of guarding data security; otherwise, when something bad happens, they will lose important data, maybe with no way to recover it.
> This feature is not enabled by default, and it also shows enough info/warnings before executing.
> Let me give another (maybe awkward) example:
> a nocluster mount is like executing the command 'rm -rf /'; do you think we should
> tell/educate customers not to execute it?

That's totally outside the domain of ocfs2; it's not an ocfs2 developer's job to
tell customers not to do that.

Here you provided an ocfs2 feature that can easily corrupt ocfs2.

As an ocfs2 developer, you should make sure ocfs2 is not corrupted even if a
customer does something bad. That's why mkfs.ocfs2/fsck.ocfs2 check
whether the ocfs2 volume is mounted in the cluster before changing anything.

>
> The nocluster mount feature was designed to resolve a real-world customer pain point:
> the SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which is complicated/inconvenient
> to set up and requires installing dozens of related packages.
>
> The nocluster feature's main use case:
> a customer wants to avoid setting up the HA stack, but wants to check the ocfs2 volume
> or back it up.
That doesn't mean you have to do this in the kernel. If customers find it painful to
set up the HA stack, you should develop a script/app to make it easy.
>
> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
> If there are problems related to a new feature, we should do our best to fix them, not revert it.

Only good/safe features; I don't think this one qualifies. Also, no one
gave a Reviewed-by to this commit, so I am not sure how it was merged.

Joseph, what's your call on this?

Thanks,

Junxiao.

>>> mount.ocfs2(8) also says to *only* mount the fs on *one* node at the same time,
>>> and it tells the user the fs will be damaged by a wrong action.
>>>
>>> ```
>>> nocluster
>>>
>>>    This  option  allows  users  to  mount a clustered volume without configuring the cluster
>>>    stack.  However, you must be aware that you can only mount the file system from one  node
>>>    at the same time, otherwise, the file system may be damaged. Please use it with caution.
>>> ```
>>>
>>>> Setup ha or other cluster-aware stack is just the cost that we have to
>>>> take for avoiding corruption, otherwise we have to do it in kernel.
>>> It's a bit drastic to totally revert this commit just for a missing sanity
>>> check. If you or the maintainer think the local mount should do more to prevent mixed
>>> local-mount and clustered-mount scenarios, we could add more sanity checks during
>>> local mounting.
>> I don't think this should be done in the kernel. Setting up the cluster stack is the way forward.
>>
> My mistake: everywhere above, 'local mount' should be 'nocluster mount'.
>
> Finally, let's fully understand your use case (or reproduce your hang issue).
>
> Thanks,
> Heming
>

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
  2022-06-06  2:08 heming.zhao--- via Ocfs2-devel
@ 2022-06-06  8:27 ` Heming Zhao via Ocfs2-devel
  2022-06-06 16:15 ` Junxiao Bi via Ocfs2-devel
  1 sibling, 0 replies; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-06-06  8:27 UTC (permalink / raw)
  To: Junxiao Bi, ocfs2-devel

On Mon, Jun 06, 2022 at 10:08:53AM +0800, heming.zhao--- via Ocfs2-devel wrote:
> Hello Junxiao,
> 
> First of all, let's get on the same page to discuss your patch.
> There are two features: 'local mount' & 'nocluster mount'.
> I mistakenly wrote local-mount in some places in previous mails.
> This patch reverts commit 912f655d78c5d4, which is related to 'nocluster mount'.
> 
> 
> On 6/5/22 00:19, Junxiao Bi wrote:
> > 
> > 
> >> 在 2022年6月4日,上午1:45,heming.zhao@suse.com 写道:
> >>
> >> Hello Junxiao,
> >>
> >>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
> >>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
> >>> This commit introduced a regression that can cause mount hung.
> >>> The changes in __ocfs2_find_empty_slot causes that any node with
> >>> none-zero node number can grab the slot that was already taken by
> >>> node 0, so node 1 will access the same journal with node 0, when it
> >>> try to grab journal cluster lock, it will hung because it was already
> >>> acquired by node 0.
> >>> It's very easy to reproduce this, in one cluster, mount node 0 first,
> >>> then node 1, you will see the following call trace from node 1.
> >>
> >> From your description, it looks like your env mixed local-mount & clustered-mount.
> > No, only cluster mount.
> >>
> >> Would you mind sharing your test/reproduction steps?
> >> And which ha stack do you use, pcmk or o2cb?
> >>
> >> I failed to reproduce it, my test steps (with pcmk stack):
> >> ```
> >> node1:
> >> mount -t ocfs2 /dev/vdd /mnt
> >>
> >> node2:
> >> for i in {1..100}; do
> >> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
> >> sleep 3;
> >> echo "umount"; umount /mnt;
> >> done
> >> ```
> >>
> Try setting one node's node number to 0 and mounting it there first. I used the o2cb stack.
> 
> Could you share more test info/steps? I can't follow your meaning.
> How do you set up a node with a fixed node number?
> In my understanding, under a pcmk env, the first mounted node automatically gets
> node number 1 (or some value greater than 0), and there is no way to set the node
> number by hand. It's very likely you mixed nocluster & cluster mounts.
> If my suspicion is right (mixed mount), your use case is wrong.
> 
> >> This local mount feature helps SUSE customers maintain ocfs2 partitions; it's useful.
> >> I want to find whether there is an ideal way to fix the hang issue.
> >>
> >>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
> >>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
> >>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> >>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
> >>> [13148.749354] Call Trace:
> >>> ...
> >>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
> >>> introduced the feature to mount ocfs2 locally even it is cluster based,
> >>> that is a very dangerous, it can easily cause serious data corruption,
> >>> there is no way to stop other nodes mounting the fs and corrupting it.
> >>
> >> I can't follow your meaning. When users want to use the local mount feature, they MUST know
> >> what they are doing and how to use it.
> > I can't agree with you. There is no mechanism to make sure customers will follow that; you can't expect customers to understand the tech well or even read the doc.
> 
> Yes, no one reads docs by default.
>
> Currently, mounting with the option 'nocluster' shows a special warning to the user:
> 
> ```
> # mount -t ocfs2 -o nocluster /dev/vdd /mnt
> Warning: to mount a clustered volume without the cluster stack.
> Please make sure you only mount the file system from one node.
> Otherwise, the file system may be damaged.
> Proceed (y/N):
> ```
> 
> > It's not the case that you don't have a choice. Setting up the cluster stack is the way to stop customers from doing something bad. I believe you have to educate customers to understand that this is the cost of guarding data security; otherwise, when something bad happens, they will lose important data, maybe with no way to recover it.
>
> This feature is not enabled by default, and it also shows enough info/warnings before executing.
> Let me give another (maybe awkward) example:
> a nocluster mount is like executing the command 'rm -rf /'; do you think we should
> tell/educate customers not to execute it?
> 
> The nocluster mount feature was designed to resolve a real-world customer pain point:
> the SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which is complicated/inconvenient
> to set up and requires installing dozens of related packages.
>
> The nocluster feature's main use case:
> a customer wants to avoid setting up the HA stack, but wants to check the ocfs2 volume
> or back it up.
> 
> In my opinion, we should make ocfs2 more powerful and include more useful features for users.
> If there are problems related to a new feature, we should do our best to fix them, not revert it.

I am not familiar with the o2cb stack. If o2cb can give a node the node
number ZERO, I have an idea to avoid mixed nocluster & cluster mounting.

there is slot map management struct:

struct ocfs2_extended_slot {
/*00*/	__u8	es_valid;
	__u8	es_reserved1[3];
	__le32	es_node_num;
/*08*/
};

We could use es_reserved1[0] to give a nocluster mount a special flag;
maybe we could define:

#define OCFS2_NOCLUSTER_MOUNT 1
if (XX->es_reserved1[0] == OCFS2_NOCLUSTER_MOUNT)
	this_slot_is_mounted_by_noclustered_mode;

The code logic:
- On a nocluster mount, check es_valid for an existing clustered mount and
  es_reserved1[0] for an existing noclustered mount.
  If no other node is mounted, do the noclustered mount and set es_reserved1[0]
  to OCFS2_NOCLUSTER_MOUNT. Clear this value on unmount.
- When another node prepares to mount in clustered mode, it should check
  es_reserved1[0] to detect a noclustered mount. ocfs2 should block the mount
  if any slot is marked with OCFS2_NOCLUSTER_MOUNT. (This makes the noclustered
  mount exclusive.)

Thanks,
Heming

> 
> >>
> >> mount.ocfs2(8) also says to *only* mount the fs on *one* node at the same time,
> >> and it tells the user the fs will be damaged by a wrong action.
> >>
> >> ```
> >> nocluster
> >>
> >>   This  option  allows  users  to  mount a clustered volume without configuring the cluster
> >>   stack.  However, you must be aware that you can only mount the file system from one  node
> >>   at the same time, otherwise, the file system may be damaged. Please use it with caution.
> >> ```
> >>
> >>> Setup ha or other cluster-aware stack is just the cost that we have to
> >>> take for avoiding corruption, otherwise we have to do it in kernel.
> >>
> >> It's a bit drastic to totally revert this commit just for a missing sanity
> >> check. If you or the maintainer think the local mount should do more to prevent mixed
> >> local-mount and clustered-mount scenarios, we could add more sanity checks during
> >> local mounting.
> > I don't think this should be done in the kernel. Setting up the cluster stack is the way forward.
> > 
> 
> My mistake: everywhere above, 'local mount' should be 'nocluster mount'.
>
> Finally, let's fully understand your use case (or reproduce your hang issue).
> 
> Thanks,
> Heming
> 



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack"
@ 2022-06-06  2:08 heming.zhao--- via Ocfs2-devel
  2022-06-06  8:27 ` Heming Zhao via Ocfs2-devel
  2022-06-06 16:15 ` Junxiao Bi via Ocfs2-devel
  0 siblings, 2 replies; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-06-06  2:08 UTC (permalink / raw)
  To: Junxiao Bi; +Cc: ocfs2-devel

Hello Junxiao,

First of all, let's get on the same page to discuss your patch.
There are two features: 'local mount' & 'nocluster mount'.
I mistakenly wrote local-mount in some places in previous mails.
This patch reverts commit 912f655d78c5d4, which is related to 'nocluster mount'.


On 6/5/22 00:19, Junxiao Bi wrote:
> 
> 
>> 在 2022年6月4日,上午1:45,heming.zhao@suse.com 写道:
>>
>> Hello Junxiao,
>>
>>> On 6/4/22 06:28, Junxiao Bi via Ocfs2-devel wrote:
>>> This reverts commit 912f655d78c5d4ad05eac287f23a435924df7144.
>>> This commit introduced a regression that can cause mount hung.
>>> The changes in __ocfs2_find_empty_slot causes that any node with
>>> none-zero node number can grab the slot that was already taken by
>>> node 0, so node 1 will access the same journal with node 0, when it
>>> try to grab journal cluster lock, it will hung because it was already
>>> acquired by node 0.
>>> It's very easy to reproduce this, in one cluster, mount node 0 first,
>>> then node 1, you will see the following call trace from node 1.
>>
>> From your description, it looks like your env mixed local-mount & clustered-mount.
> No, only cluster mount.
>>
>> Would you mind sharing your test/reproduction steps?
>> And which ha stack do you use, pcmk or o2cb?
>>
>> I failed to reproduce it, my test steps (with pcmk stack):
>> ```
>> node1:
>> mount -t ocfs2 /dev/vdd /mnt
>>
>> node2:
>> for i in {1..100}; do
>> echo "mount <$i>"; mount -t ocfs2 /dev/vdd /mnt;
>> sleep 3;
>> echo "umount"; umount /mnt;
>> done
>> ```
>>
> Try setting one node's node number to 0 and mounting it there first. I used the o2cb stack.

Could you share more test info/steps? I can't follow your meaning.
How do you set up a node with a fixed node number?
In my understanding, under a pcmk env, the first mounted node automatically gets
node number 1 (or some value greater than 0), and there is no way to set the node
number by hand. It's very likely you mixed nocluster & cluster mounts.
If my suspicion is right (mixed mount), your use case is wrong.

>> This local mount feature helps SUSE customers maintain ocfs2 partitions; it's useful.
>> I want to find whether there is an ideal way to fix the hang issue.
>>
>>> [13148.735424] INFO: task mount.ocfs2:53045 blocked for more than 122 seconds.
>>> [13148.739691]       Not tainted 5.15.0-2148.0.4.el8uek.mountracev2.x86_64 #2
>>> [13148.742560] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>> [13148.745846] task:mount.ocfs2     state:D stack:    0 pid:53045 ppid: 53044 flags:0x00004000
>>> [13148.749354] Call Trace:
>>> ...
>>> To fix it, we can just fix __ocfs2_find_empty_slot. But original commit
>>> introduced the feature to mount ocfs2 locally even it is cluster based,
>>> that is a very dangerous, it can easily cause serious data corruption,
>>> there is no way to stop other nodes mounting the fs and corrupting it.
>>
>> I can't follow your meaning. When users want to use the local mount feature, they MUST know
>> what they are doing and how to use it.
> I can't agree with you. There is no mechanism to make sure customers will follow that; you can't expect customers to understand the tech well or even read the doc.

Yes, no one reads docs by default.

Currently, mounting with the option 'nocluster' shows a special warning to the user:

```
# mount -t ocfs2 -o nocluster /dev/vdd /mnt
Warning: to mount a clustered volume without the cluster stack.
Please make sure you only mount the file system from one node.
Otherwise, the file system may be damaged.
Proceed (y/N):
```

> It's not the case that you don't have a choice. Setting up the cluster stack is the way to stop customers from doing something bad. I believe you have to educate customers to understand that this is the cost of guarding data security; otherwise, when something bad happens, they will lose important data, maybe with no way to recover it.

This feature is not enabled by default, and it also shows enough info/warnings before executing.
Let me give another (maybe awkward) example:
a nocluster mount is like executing the command 'rm -rf /'; do you think we should
tell/educate customers not to execute it?

The nocluster mount feature was designed to resolve a real-world customer pain point:
the SUSE HA stack uses pacemaker+corosync+fsdlm+ocfs2, which is complicated/inconvenient
to set up and requires installing dozens of related packages.

The nocluster feature's main use case:
a customer wants to avoid setting up the HA stack, but wants to check the ocfs2 volume
or back it up.

In my opinion, we should make ocfs2 more powerful and include more useful features for users.
If there are problems related to a new feature, we should do our best to fix them, not revert it.

>>
>> mount.ocfs2(8) also says to *only* mount the fs on *one* node at the same time,
>> and it tells the user the fs will be damaged by a wrong action.
>>
>> ```
>> nocluster
>>
>>   This  option  allows  users  to  mount a clustered volume without configuring the cluster
>>   stack.  However, you must be aware that you can only mount the file system from one  node
>>   at the same time, otherwise, the file system may be damaged. Please use it with caution.
>> ```
>>
>>> Setup ha or other cluster-aware stack is just the cost that we have to
>>> take for avoiding corruption, otherwise we have to do it in kernel.
>>
>> It's a bit drastic to totally revert this commit just for a missing sanity
>> check. If you or the maintainer think the local mount should do more to prevent mixed
>> local-mount and clustered-mount scenarios, we could add more sanity checks during
>> local mounting.
> I don't think this should be done in the kernel. Setting up the cluster stack is the way forward.
> 

My mistake: everywhere above, 'local mount' should be 'nocluster mount'.

Finally, let's fully understand your use case (or reproduce your hang issue).

Thanks,
Heming



^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-06-25 13:30 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-03 22:28 [Ocfs2-devel] [PATCH] Revert "ocfs2: mount shared volume without ha stack" Junxiao Bi via Ocfs2-devel
2022-06-04  8:45 ` heming.zhao--- via Ocfs2-devel
2022-06-04 16:19   ` Junxiao Bi via Ocfs2-devel
2022-06-25 13:30     ` Joseph Qi via Ocfs2-devel
2022-06-06  2:08 heming.zhao--- via Ocfs2-devel
2022-06-06  8:27 ` Heming Zhao via Ocfs2-devel
2022-06-06 16:15 ` Junxiao Bi via Ocfs2-devel
2022-06-06 23:50   ` heming.zhao--- via Ocfs2-devel
2022-06-07  0:32     ` Heming Zhao via Ocfs2-devel
2022-06-07  2:07     ` Joseph Qi via Ocfs2-devel
2022-06-07  2:21       ` Junxiao Bi via Ocfs2-devel
2022-06-07  2:38         ` heming.zhao--- via Ocfs2-devel
2022-06-07  3:06           ` Junxiao Bi via Ocfs2-devel
2022-06-07  6:31             ` Joseph Qi via Ocfs2-devel
2022-06-07  9:42               ` Heming Zhao via Ocfs2-devel
