[Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage
@ 2022-04-08 10:30 Heming Zhao via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

draft -> v1: 
- split one patch into 5 patches.
- goto labal name change to out_xxx style.
- only test for mount/umount & 0001-xx.patch related issue.
- polish some codes

note:
I didn't totally revert da5e7c87827e8, just revert init part codes.

Heming Zhao (5):
  ocfs2: partly revert da5e7c87827e8 for mounting crash issue
  ocfs2: change return type of ocfs2_resmap_init
  ocfs2: ocfs2_initialize_super does cleanup job before return error
  ocfs2: ocfs2_mount_volume does cleanup job before return error
  ocfs2: rewrite error handling of ocfs2_fill_super

 fs/ocfs2/inode.c        |   4 +-
 fs/ocfs2/journal.c      |  19 ----
 fs/ocfs2/reservations.c |   4 +-
 fs/ocfs2/reservations.h |   5 +-
 fs/ocfs2/super.c        | 195 ++++++++++++++++++++++++++--------------
 5 files changed, 132 insertions(+), 95 deletions(-)

-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
  2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
  2022-04-09 13:11   ` Joseph Qi via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
journal init later than before, it makes NULL pointer access in free
routine.

Crash flow:

ocfs2_fill_super
 + ocfs2_mount_volume
 |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
 |  + ...
 |  + ocfs2_check_volume //no chance to init osb->journal
 |
 + ...
 + ocfs2_dismount_volume
    ocfs2_release_system_inodes
      ...
       evict
        ...
         ocfs2_clear_inode
          ocfs2_checkpoint_inode
           ocfs2_ci_fully_checkpointed
            time_after(journal->j_trans_id, ci->ci_last_trans)
             + journal is empty, crash!

For fixing, there are three solutions:

1> Revert commit da5e7c87827e8

For avoiding kernel crash, this make sense for us. We only concerned
whether there has any non-system inode access before dlm init. The
answer is NO. And all journal replay/recovery handling happen after
dlm & journal init done. So this method is not graceful but workable.

2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)

The fix code is special for mounting phase, but it will continue
working after mounting stage. In another word, this method adds useless
code in normal inode free flow.

3> Do directly free inode in mounting phase

This method is brutal/complex and may introduce unsafe code, currently
maintainer didn't like.

At last, we chose method <1> and partly reverted commit da5e7c87827e8.
We reverted journal init code, and kept cleanup code flow.

Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/inode.c   |  4 ++--
 fs/ocfs2/journal.c | 21 +--------------------
 fs/ocfs2/super.c   | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 36 insertions(+), 22 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 5739dc301569..bb116c39b581 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 	struct inode *inode = NULL;
 	struct super_block *sb = osb->sb;
 	struct ocfs2_find_inode_args args;
+	journal_t *journal = osb->journal->j_journal;
 
 	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
 			       sysfile_type);
@@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 	 * part of the transaction - the inode could have been reclaimed and
 	 * now it is reread from disk.
 	 */
-	if (osb->journal) {
+	if (journal) {
 		transaction_t *transaction;
 		tid_t tid;
 		struct ocfs2_inode_info *oi = OCFS2_I(inode);
-		journal_t *journal = osb->journal->j_journal;
 
 		read_lock(&journal->j_state_lock);
 		if (journal->j_running_transaction)
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 1887a2708709..afb85de3bb60 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
 	int status = -1;
 	struct inode *inode = NULL; /* the journal inode */
 	journal_t *j_journal = NULL;
-	struct ocfs2_journal *journal = NULL;
+	struct ocfs2_journal *journal = osb->journal;
 	struct ocfs2_dinode *di = NULL;
 	struct buffer_head *bh = NULL;
 	int inode_lock = 0;
 
-	/* initialize our journal structure */
-	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
-	if (!journal) {
-		mlog(ML_ERROR, "unable to alloc journal\n");
-		status = -ENOMEM;
-		goto done;
-	}
-	osb->journal = journal;
-	journal->j_osb = osb;
-
-	atomic_set(&journal->j_num_trans, 0);
-	init_rwsem(&journal->j_trans_barrier);
-	init_waitqueue_head(&journal->j_checkpointed);
-	spin_lock_init(&journal->j_lock);
-	journal->j_trans_id = 1UL;
-	INIT_LIST_HEAD(&journal->j_la_cleanups);
-	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
-	journal->j_state = OCFS2_JOURNAL_FREE;
-
 	/* already have the inode for our journal */
 	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
 					    osb->slot_num);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 477cdf94122e..5f63a2333e52 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	int i, cbits, bbits;
 	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
 	struct inode *inode = NULL;
+	struct ocfs2_journal *journal;
 	struct ocfs2_super *osb;
 	u64 total_blocks;
 
@@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
 
 	get_random_bytes(&osb->s_next_generation, sizeof(u32));
 
+	/* FIXME
+	 * This should be done in ocfs2_journal_init(), but unknown
+	 * ordering issues will cause the filesystem to crash.
+	 * If anyone wants to figure out what part of the code
+	 * refers to osb->journal before ocfs2_journal_init() is run,
+	 * be my guest.
+	 */
+	/* initialize our journal structure */
+	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
+	if (!journal) {
+		mlog(ML_ERROR, "unable to alloc journal\n");
+		status = -ENOMEM;
+		goto bail;
+	}
+	osb->journal = journal;
+	journal->j_osb = osb;
+
+	atomic_set(&journal->j_num_trans, 0);
+	init_rwsem(&journal->j_trans_barrier);
+	init_waitqueue_head(&journal->j_checkpointed);
+	spin_lock_init(&journal->j_lock);
+	journal->j_trans_id = 1UL;
+	INIT_LIST_HEAD(&journal->j_la_cleanups);
+	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
+	journal->j_state = OCFS2_JOURNAL_FREE;
+
 	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
 	init_llist_head(&osb->dquot_drop_list);
 
@@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
 
 	kfree(osb->osb_orphan_wipes);
 	kfree(osb->slot_recovery_generations);
+	/* FIXME
+	 * This belongs in journal shutdown, but because we have to
+	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
+	 * we free it here.
+	 */
+	kfree(osb->journal);
 	kfree(osb->local_alloc_copy);
 	kfree(osb->uuid_str);
 	kfree(osb->vol_label);
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
  2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
  2022-04-09 13:14   ` Joseph Qi via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

For rewrite error handling of ocfs2_initialize_super, change this
function return type from int to void.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/reservations.c | 4 +---
 fs/ocfs2/reservations.h | 5 ++---
 fs/ocfs2/super.c        | 6 +-----
 3 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
index 769e466887b0..a9d1296d736d 100644
--- a/fs/ocfs2/reservations.c
+++ b/fs/ocfs2/reservations.c
@@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
 	resv->r_flags |= flags;
 }
 
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
 		      struct ocfs2_reservation_map *resmap)
 {
 	memset(resmap, 0, sizeof(*resmap));
@@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
 	resmap->m_reservations = RB_ROOT;
 	/* m_bitmap_len is initialized to zero by the above memset. */
 	INIT_LIST_HEAD(&resmap->m_lru);
-
-	return 0;
 }
 
 static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
index 677c50663595..532d42eb6db7 100644
--- a/fs/ocfs2/reservations.h
+++ b/fs/ocfs2/reservations.h
@@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
  * @ops: unused for now
  * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
  *
- * Only possible return value other than '0' is -ENOMEM for failure to
- * allocation mirror bitmap.
+ * This function does initialize resmap job.
  */
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
 		      struct ocfs2_reservation_map *resmap);
 
 /**
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 8391592817f8..f91c5510bc7e 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 
 	init_waitqueue_head(&osb->osb_mount_event);
 
-	status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
-	if (status) {
-		mlog_errno(status);
-		goto bail;
-	}
+	ocfs2_resmap_init(osb, &osb->osb_la_resmap);
 
 	osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
 	if (!osb->vol_label) {
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
  2022-04-09 13:30   ` Joseph Qi via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
  4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_initialize_super.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 40 insertions(+), 18 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index f91c5510bc7e..8443ba031dec 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out;
 	}
 
 	sb->s_fs_info = osb;
@@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
 		     osb->max_slots);
 		status = -EINVAL;
-		goto bail;
+		goto out;
 	}
 
 	ocfs2_orphan_scan_init(osb);
@@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (status) {
 		mlog(ML_ERROR, "Unable to initialize recovery state\n");
 		mlog_errno(status);
-		goto bail;
+		goto out;
 	}
 
 	init_waitqueue_head(&osb->checkpoint_event);
@@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->vol_label) {
 		mlog(ML_ERROR, "unable to alloc vol label\n");
 		status = -ENOMEM;
-		goto bail;
+		goto out_recovery_map;
 	}
 
 	osb->slot_recovery_generations =
@@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->slot_recovery_generations) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_vol_label;
 	}
 
 	init_waitqueue_head(&osb->osb_wipe_event);
@@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->osb_orphan_wipes) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_slot_recovery_gen;
 	}
 
 	osb->osb_rf_lock_tree = RB_ROOT;
@@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "couldn't mount because of unsupported "
 		     "optional features (%x).\n", i);
 		status = -EINVAL;
-		goto bail;
+		goto out_orphan_wipes;
 	}
 	if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
 		mlog(ML_ERROR, "couldn't mount RDWR because of "
 		     "unsupported optional features (%x).\n", i);
 		status = -EINVAL;
-		goto bail;
+		goto out_orphan_wipes;
 	}
 
 	if (ocfs2_clusterinfo_valid(osb)) {
@@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 			     "cluster stack label (%s) \n",
 			     osb->osb_cluster_stack);
 			status = -EINVAL;
-			goto bail;
+			goto out_orphan_wipes;
 		}
 		memcpy(osb->osb_cluster_name,
 			OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
@@ -2204,7 +2204,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!journal) {
 		mlog(ML_ERROR, "unable to alloc journal\n");
 		status = -ENOMEM;
-		goto bail;
+		goto out_orphan_wipes;
 	}
 	osb->journal = journal;
 	journal->j_osb = osb;
@@ -2231,7 +2231,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
 		     osb->s_clustersize);
 		status = -EINVAL;
-		goto bail;
+		goto out_journal;
 	}
 
 	total_blocks = ocfs2_clusters_to_blocks(osb->sb,
@@ -2243,14 +2243,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Volume too large "
 		     "to mount safely on this system");
 		status = -EFBIG;
-		goto bail;
+		goto out_journal;
 	}
 
 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {
 		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
 		status = -ENOMEM;
-		goto bail;
+		goto out_journal;
 	}
 
 	strlcpy(osb->vol_label, di->id2.i_super.s_label,
@@ -2270,7 +2270,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->osb_dlm_debug) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_uuid_str;
 	}
 
 	atomic_set(&osb->vol_state, VOLUME_INIT);
@@ -2279,7 +2279,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	status = ocfs2_init_global_system_inodes(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto bail;
+		goto out_dlm_out;
 	}
 
 	/*
@@ -2290,7 +2290,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!inode) {
 		status = -EINVAL;
 		mlog_errno(status);
-		goto bail;
+		goto out_system_inodes;
 	}
 
 	osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
@@ -2303,16 +2303,38 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	status = ocfs2_init_slot_info(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto bail;
+		goto out_system_inodes;
 	}
 
 	osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
 	if (!osb->ocfs2_wq) {
 		status = -ENOMEM;
 		mlog_errno(status);
+		goto out_slot_info;
 	}
 
-bail:
+	return status;
+
+out_slot_info:
+	ocfs2_free_slot_info(osb);
+out_system_inodes:
+	ocfs2_release_system_inodes(osb);
+out_dlm_out:
+	ocfs2_put_dlm_debug(osb->osb_dlm_debug);
+out_uuid_str:
+	kfree(osb->uuid_str);
+out_journal:
+	kfree(osb->journal);
+out_orphan_wipes:
+	kfree(osb->osb_orphan_wipes);
+out_slot_recovery_gen:
+	kfree(osb->slot_recovery_generations);
+out_vol_label:
+	kfree(osb->vol_label);
+out_recovery_map:
+	kfree(osb->recovery_map);
+out:
+	kfree(osb);
 	return status;
 }
 
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
                   ` (2 preceding siblings ...)
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
  2022-04-09 13:46   ` Joseph Qi via Ocfs2-devel
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
  4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_mount_volume.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 8443ba031dec..d4b7a29cb364 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
 static int ocfs2_mount_volume(struct super_block *sb)
 {
 	int status = 0;
-	int unlock_super = 0;
 	struct ocfs2_super *osb = OCFS2_SB(sb);
 
 	if (ocfs2_is_hard_readonly(osb))
-		goto leave;
+		goto out;
 
 	mutex_init(&osb->obs_trim_fs_mutex);
 
@@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
 		if (status == -EBADR && ocfs2_userspace_stack(osb))
 			mlog(ML_ERROR, "couldn't mount because cluster name on"
 			" disk does not match the running cluster name.\n");
-		goto leave;
+		goto out;
 	}
 
 	status = ocfs2_super_lock(osb, 1);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_dlm;
 	}
-	unlock_super = 1;
 
 	/* This will load up the node map and add ourselves to it. */
 	status = ocfs2_find_slot(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_super_lock;
 	}
 
 	/* load all node-local system inodes */
 	status = ocfs2_init_local_system_inodes(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_super_lock;
 	}
 
 	status = ocfs2_check_volume(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_system_inodes;
 	}
 
 	status = ocfs2_truncate_log_init(osb);
-	if (status < 0)
+	if (status < 0) {
 		mlog_errno(status);
+		goto out_journal_shutdown;
+	}
 
-leave:
-	if (unlock_super)
-		ocfs2_super_unlock(osb, 1);
+	ocfs2_super_unlock(osb, 1);
+	return status;
 
+out_journal_shutdown:
+	ocfs2_journal_shutdown(osb);
+out_system_inodes:
+	ocfs2_release_system_inodes(osb);
+out_super_lock:
+	ocfs2_super_unlock(osb, 1);
+out_dlm:
+	ocfs2_dlm_shutdown(osb, 0);
+out:
 	return status;
 }
 
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
                   ` (3 preceding siblings ...)
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
  2022-04-09 14:02   ` Joseph Qi via Ocfs2-devel
  4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

current ocfs2_fill_super error handling is very simple, and includes
kernel crash possibility.

e.g.

ocfs2_fill_super
  ocfs2_initialize_super // fails in check max_slots & return

Then osb->osb_mount_event is not properly initialized, and then will
crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.

For fixing the issue and enhancing error handling, we rewrite error
handling of ocfs2_fill_super.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
 1 file changed, 33 insertions(+), 34 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index d4b7a29cb364..c3efd9ad49ac 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
 		status = -EINVAL;
-		goto read_super_error;
+		goto out;
 	}
 
 	/* probe for superblock */
 	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
 	if (status < 0) {
 		mlog(ML_ERROR, "superblock probe failed!\n");
-		goto read_super_error;
+		goto out;
 	}
 
 	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
-	osb = OCFS2_SB(sb);
-	if (status < 0) {
-		mlog_errno(status);
-		goto read_super_error;
-	}
 	brelse(bh);
 	bh = NULL;
+	if (status < 0)
+		goto out;
+
+	osb = OCFS2_SB(sb);
 
 	if (!ocfs2_check_set_options(sb, &parsed_options)) {
 		status = -EINVAL;
-		goto read_super_error;
+		goto out_super;
 	}
 	osb->s_mount_opt = parsed_options.mount_opt;
 	osb->s_atime_quantum = parsed_options.atime_quantum;
@@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
 	if (status)
-		goto read_super_error;
+		goto out_super;
 
 	sb->s_magic = OCFS2_SUPER_MAGIC;
 
@@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 			status = -EACCES;
 			mlog(ML_ERROR, "Readonly device detected but readonly "
 			     "mount was not specified.\n");
-			goto read_super_error;
+			goto out_super;
 		}
 
 		/* You should not be able to start a local heartbeat
@@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 			status = -EROFS;
 			mlog(ML_ERROR, "Local heartbeat specified on readonly "
 			     "device.\n");
-			goto read_super_error;
+			goto out_super;
 		}
 
 		status = ocfs2_check_journals_nolocks(osb);
@@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 				mlog(ML_ERROR, "Recovery required on readonly "
 				     "file system, but write access is "
 				     "unavailable.\n");
-			else
-				mlog_errno(status);
-			goto read_super_error;
+			goto out_super;
 		}
 
 		ocfs2_set_ro_flag(osb, 1);
@@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	status = ocfs2_verify_heartbeat(osb);
-	if (status < 0) {
-		mlog_errno(status);
-		goto read_super_error;
-	}
+	if (status < 0)
+		goto out_super;
 
 	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
 						 ocfs2_debugfs_root);
@@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	status = ocfs2_mount_volume(sb);
 	if (status < 0)
-		goto read_super_error;
+		goto out_debugfs;
 
 	if (osb->root_inode)
 		inode = igrab(osb->root_inode);
 
 	if (!inode) {
 		status = -EIO;
-		mlog_errno(status);
-		goto read_super_error;
+		goto out_journal;
 	}
 
 	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
@@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	if (!osb->osb_dev_kset) {
 		status = -ENOMEM;
 		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
-		goto read_super_error;
+		goto out_journal;
 	}
 
 	/* Create filecheck sysfs related directories/files at
@@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 		status = -ENOMEM;
 		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
 			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
-		goto read_super_error;
+		goto out_journal;
 	}
 
 	root = d_make_root(inode);
 	if (!root) {
 		status = -ENOMEM;
-		mlog_errno(status);
-		goto read_super_error;
+		goto out_journal;
 	}
 
 	sb->s_root = root;
@@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	return status;
 
-read_super_error:
-	brelse(bh);
+out_journal:
+	mlog_errno(status);
+	atomic_set(&osb->vol_state, VOLUME_DISABLED);
+	wake_up(&osb->osb_mount_event);
+	ocfs2_dismount_volume(sb, 1);
 
-	if (status)
-		mlog_errno(status);
+	return status;
 
-	if (osb) {
-		atomic_set(&osb->vol_state, VOLUME_DISABLED);
-		wake_up(&osb->osb_mount_event);
-		ocfs2_dismount_volume(sb, 1);
-	}
+out_debugfs:
+	debugfs_remove_recursive(osb->osb_debug_root);
+out_super:
+	ocfs2_release_system_inodes(osb);
+	kfree(osb->recovery_map);
+	ocfs2_delete_osb(osb);
+	kfree(osb);
+out:
+	mlog_errno(status);
 
 	return status;
 }
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:11   ` Joseph Qi via Ocfs2-devel
  2022-04-09 16:28     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:11 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel

Suggest rename the subject, something like
ocfs2: fix mount crash if journal is not alloced

On 4/8/22 6:30 PM, Heming Zhao wrote:
> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
> journal init later than before, it makes NULL pointer access in free
> routine.
> 
> Crash flow:
> 
> ocfs2_fill_super
>  + ocfs2_mount_volume
>  |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>  |  + ...
>  |  + ocfs2_check_volume //no chance to init osb->journal
>  |
>  + ...
>  + ocfs2_dismount_volume
>     ocfs2_release_system_inodes
>       ...
>        evict
>         ...
>          ocfs2_clear_inode
>           ocfs2_checkpoint_inode
>            ocfs2_ci_fully_checkpointed
>             time_after(journal->j_trans_id, ci->ci_last_trans)
>              + journal is empty, crash!
> 
> For fixing, there are three solutions:
> 
> 1> Revert commit da5e7c87827e8
> 
> For avoiding kernel crash, this make sense for us. We only concerned
> whether there has any non-system inode access before dlm init. The
> answer is NO. And all journal replay/recovery handling happen after
> dlm & journal init done. So this method is not graceful but workable.
> 
> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
> 
> The fix code is special for mounting phase, but it will continue
> working after mounting stage. In another word, this method adds useless
> code in normal inode free flow.
> 
> 3> Do directly free inode in mounting phase
> 
> This method is brutal/complex and may introduce unsafe code, currently
> maintainer didn't like.
> 
> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
> We reverted journal init code, and kept cleanup code flow.
> 
> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/inode.c   |  4 ++--
>  fs/ocfs2/journal.c | 21 +--------------------
>  fs/ocfs2/super.c   | 33 +++++++++++++++++++++++++++++++++
>  3 files changed, 36 insertions(+), 22 deletions(-)
> 
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 5739dc301569..bb116c39b581 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>  	struct inode *inode = NULL;
>  	struct super_block *sb = osb->sb;
>  	struct ocfs2_find_inode_args args;
> +	journal_t *journal = osb->journal->j_journal;
>  
>  	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>  			       sysfile_type);
> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>  	 * part of the transaction - the inode could have been reclaimed and
>  	 * now it is reread from disk.
>  	 */
> -	if (osb->journal) {
> +	if (journal) {
>  		transaction_t *transaction;
>  		tid_t tid;
>  		struct ocfs2_inode_info *oi = OCFS2_I(inode);
> -		journal_t *journal = osb->journal->j_journal;
>  
>  		read_lock(&journal->j_state_lock);
>  		if (journal->j_running_transaction)
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index 1887a2708709..afb85de3bb60 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>  	int status = -1;
>  	struct inode *inode = NULL; /* the journal inode */
>  	journal_t *j_journal = NULL;
> -	struct ocfs2_journal *journal = NULL;
> +	struct ocfs2_journal *journal = osb->journal;
>  	struct ocfs2_dinode *di = NULL;
>  	struct buffer_head *bh = NULL;
>  	int inode_lock = 0;
>  
> -	/* initialize our journal structure */
> -	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
> -	if (!journal) {
> -		mlog(ML_ERROR, "unable to alloc journal\n");
> -		status = -ENOMEM;
> -		goto done;
> -	}
> -	osb->journal = journal;
> -	journal->j_osb = osb;
> -
> -	atomic_set(&journal->j_num_trans, 0);
> -	init_rwsem(&journal->j_trans_barrier);
> -	init_waitqueue_head(&journal->j_checkpointed);
> -	spin_lock_init(&journal->j_lock);
> -	journal->j_trans_id = 1UL;
> -	INIT_LIST_HEAD(&journal->j_la_cleanups);
> -	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
> -	journal->j_state = OCFS2_JOURNAL_FREE;
> -
>  	/* already have the inode for our journal */
>  	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>  					    osb->slot_num);
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 477cdf94122e..5f63a2333e52 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	int i, cbits, bbits;
>  	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>  	struct inode *inode = NULL;
> +	struct ocfs2_journal *journal;
>  	struct ocfs2_super *osb;
>  	u64 total_blocks;
>  
> @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  
>  	get_random_bytes(&osb->s_next_generation, sizeof(u32));
>  
> +	/* FIXME
> +	 * This should be done in ocfs2_journal_init(), but unknown
> +	 * ordering issues will cause the filesystem to crash.
> +	 * If anyone wants to figure out what part of the code
> +	 * refers to osb->journal before ocfs2_journal_init() is run,
> +	 * be my guest.
> +	 */
> +	/* initialize our journal structure */
> +	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
> +	if (!journal) {
> +		mlog(ML_ERROR, "unable to alloc journal\n");
> +		status = -ENOMEM;
> +		goto bail;
> +	}
> +	osb->journal = journal;
> +	journal->j_osb = osb;
> +
> +	atomic_set(&journal->j_num_trans, 0);
> +	init_rwsem(&journal->j_trans_barrier);
> +	init_waitqueue_head(&journal->j_checkpointed);
> +	spin_lock_init(&journal->j_lock);
> +	journal->j_trans_id = 1UL;
> +	INIT_LIST_HEAD(&journal->j_la_cleanups);
> +	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
> +	journal->j_state = OCFS2_JOURNAL_FREE;
> +
We may fold these into a new function, such as ocfs2_journal_alloc().

Thanks,
Joseph

>  	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>  	init_llist_head(&osb->dquot_drop_list);
>  
> @@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>  
>  	kfree(osb->osb_orphan_wipes);
>  	kfree(osb->slot_recovery_generations);
> +	/* FIXME
> +	 * This belongs in journal shutdown, but because we have to
> +	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
> +	 * we free it here.
> +	 */
> +	kfree(osb->journal);
>  	kfree(osb->local_alloc_copy);
>  	kfree(osb->uuid_str);
>  	kfree(osb->vol_label);

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:14   ` Joseph Qi via Ocfs2-devel
  2022-04-09 16:29     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:14 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/8/22 6:30 PM, Heming Zhao wrote:
> For rewrite error handling of ocfs2_initialize_super, change this
> function return type from int to void.
> 
Since ocfs2_resmap_init() always return 0, change it to void.

> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/reservations.c | 4 +---
>  fs/ocfs2/reservations.h | 5 ++---
>  fs/ocfs2/super.c        | 6 +-----
>  3 files changed, 4 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
> index 769e466887b0..a9d1296d736d 100644
> --- a/fs/ocfs2/reservations.c
> +++ b/fs/ocfs2/reservations.c
> @@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
>  	resv->r_flags |= flags;
>  }
>  
> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>  		      struct ocfs2_reservation_map *resmap)
>  {
>  	memset(resmap, 0, sizeof(*resmap));
> @@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
>  	resmap->m_reservations = RB_ROOT;
>  	/* m_bitmap_len is initialized to zero by the above memset. */
>  	INIT_LIST_HEAD(&resmap->m_lru);
> -
> -	return 0;
>  }
>  
>  static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
> diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
> index 677c50663595..532d42eb6db7 100644
> --- a/fs/ocfs2/reservations.h
> +++ b/fs/ocfs2/reservations.h
> @@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
>   * @ops: unused for now
>   * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
>   *
> - * Only possible return value other than '0' is -ENOMEM for failure to
> - * allocation mirror bitmap.
> + * This function does initialize resmap job.
>   */
This comments is meaningless, so just remove it looks better.

Thanks,
Joseph

> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>  		      struct ocfs2_reservation_map *resmap);
>  
>  /**
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 8391592817f8..f91c5510bc7e 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  
>  	init_waitqueue_head(&osb->osb_mount_event);
>  
> -	status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
> -	if (status) {
> -		mlog_errno(status);
> -		goto bail;
> -	}
> +	ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>  
>  	osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
>  	if (!osb->vol_label) {

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:30   ` Joseph Qi via Ocfs2-devel
  2022-04-09 16:47     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:30 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/8/22 6:30 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_initialize_super.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>  1 file changed, 40 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index f91c5510bc7e..8443ba031dec 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out;
>  	}
>  
>  	sb->s_fs_info = osb;
> @@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
>  		     osb->max_slots);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out;
>  	}
>  
>  	ocfs2_orphan_scan_init(osb);
> @@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (status) {
>  		mlog(ML_ERROR, "Unable to initialize recovery state\n");
>  		mlog_errno(status);
> -		goto bail;
> +		goto out;
>  	}
>  
>  	init_waitqueue_head(&osb->checkpoint_event);
> @@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->vol_label) {
>  		mlog(ML_ERROR, "unable to alloc vol label\n");
>  		status = -ENOMEM;
> -		goto bail;
> +		goto out_recovery_map;
>  	}
>  
>  	osb->slot_recovery_generations =
> @@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->slot_recovery_generations) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_vol_label;
>  	}
>  
>  	init_waitqueue_head(&osb->osb_wipe_event);
> @@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->osb_orphan_wipes) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_slot_recovery_gen;
>  	}
>  
>  	osb->osb_rf_lock_tree = RB_ROOT;
> @@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "couldn't mount because of unsupported "
>  		     "optional features (%x).\n", i);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_orphan_wipes;
>  	}
>  	if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
>  		mlog(ML_ERROR, "couldn't mount RDWR because of "
>  		     "unsupported optional features (%x).\n", i);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_orphan_wipes;
>  	}
>  
>  	if (ocfs2_clusterinfo_valid(osb)) {
> @@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  			     "cluster stack label (%s) \n",
>  			     osb->osb_cluster_stack);
>  			status = -EINVAL;
> -			goto bail;
> +			goto out_orphan_wipes;
>  		}
>  		memcpy(osb->osb_cluster_name,
>  			OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
> @@ -2204,7 +2204,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!journal) {
>  		mlog(ML_ERROR, "unable to alloc journal\n");
>  		status = -ENOMEM;
> -		goto bail;
> +		goto out_orphan_wipes;
>  	}
>  	osb->journal = journal;
>  	journal->j_osb = osb;
> @@ -2231,7 +2231,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
>  		     osb->s_clustersize);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	total_blocks = ocfs2_clusters_to_blocks(osb->sb,
> @@ -2243,14 +2243,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Volume too large "
>  		     "to mount safely on this system");
>  		status = -EFBIG;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
>  				 sizeof(di->id2.i_super.s_uuid))) {
>  		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
>  		status = -ENOMEM;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	strlcpy(osb->vol_label, di->id2.i_super.s_label,
> @@ -2270,7 +2270,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->osb_dlm_debug) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_uuid_str;
>  	}
>  
>  	atomic_set(&osb->vol_state, VOLUME_INIT);
> @@ -2279,7 +2279,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	status = ocfs2_init_global_system_inodes(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_dlm_out;
>  	}
>  
>  	/*
> @@ -2290,7 +2290,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!inode) {
>  		status = -EINVAL;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_system_inodes;
>  	}
>  
>  	osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
> @@ -2303,16 +2303,38 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	status = ocfs2_init_slot_info(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_system_inodes;
>  	}
>  
>  	osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
>  	if (!osb->ocfs2_wq) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> +		goto out_slot_info;
>  	}
>  
> -bail:
> +	return status;
> +
> +out_slot_info:
> +	ocfs2_free_slot_info(osb);
> +out_system_inodes:
> +	ocfs2_release_system_inodes(osb);
> +out_dlm_out:
> +	ocfs2_put_dlm_debug(osb->osb_dlm_debug);
> +out_uuid_str:
> +	kfree(osb->uuid_str);
> +out_journal:
> +	kfree(osb->journal);
> +out_orphan_wipes:
> +	kfree(osb->osb_orphan_wipes);
> +out_slot_recovery_gen:
> +	kfree(osb->slot_recovery_generations);
> +out_vol_label:
> +	kfree(osb->vol_label);
> +out_recovery_map:
> +	kfree(osb->recovery_map);
> +out:
> +	kfree(osb);

Should set osb to NULL here to prevent UAF in ocfs2_fill_super().

Thanks,
Joseph

>  	return status;
>  }
>  

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:46   ` Joseph Qi via Ocfs2-devel
  2022-04-10  3:58     ` heming.zhao--- via Ocfs2-devel
  2022-04-12  8:22     ` Heming Zhao via Ocfs2-devel
  0 siblings, 2 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:46 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/8/22 6:30 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_mount_volume.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
>  1 file changed, 20 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 8443ba031dec..d4b7a29cb364 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>  static int ocfs2_mount_volume(struct super_block *sb)
>  {
>  	int status = 0;
> -	int unlock_super = 0;
>  	struct ocfs2_super *osb = OCFS2_SB(sb);
>  
>  	if (ocfs2_is_hard_readonly(osb))
> -		goto leave;
> +		goto out;
>  
>  	mutex_init(&osb->obs_trim_fs_mutex);
>  
> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
>  		if (status == -EBADR && ocfs2_userspace_stack(osb))
>  			mlog(ML_ERROR, "couldn't mount because cluster name on"
>  			" disk does not match the running cluster name.\n");
> -		goto leave;
> +		goto out;
>  	}
>  
>  	status = ocfs2_super_lock(osb, 1);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_dlm;
>  	}
> -	unlock_super = 1;
>  
>  	/* This will load up the node map and add ourselves to it. */
>  	status = ocfs2_find_slot(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_super_lock;
>  	}
>  
>  	/* load all node-local system inodes */
>  	status = ocfs2_init_local_system_inodes(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_super_lock;
>  	}
>  
>  	status = ocfs2_check_volume(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_system_inodes;
>  	}
>  
>  	status = ocfs2_truncate_log_init(osb);
> -	if (status < 0)
> +	if (status < 0) {
>  		mlog_errno(status);
> +		goto out_journal_shutdown;

ocfs2_check_volume() not only does journal initialization,
but also local alloc.

> +	}
>  
> -leave:
> -	if (unlock_super)
> -		ocfs2_super_unlock(osb, 1);
> +	ocfs2_super_unlock(osb, 1);
> +	return status;
Explicitly return 0 may be better, which means this is the
normal path.

Thanks,
Joseph

>
> +out_journal_shutdown:
> +	ocfs2_journal_shutdown(osb);
> +out_system_inodes:
> +	ocfs2_release_system_inodes(osb);
> +out_super_lock:
> +	ocfs2_super_unlock(osb, 1);
> +out_dlm:
> +	ocfs2_dlm_shutdown(osb, 0);
> +out:
>  	return status;
>  }
>  

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
@ 2022-04-09 14:02   ` Joseph Qi via Ocfs2-devel
  2022-04-10 10:25     ` heming.zhao--- via Ocfs2-devel
  2022-04-12  8:49     ` Heming Zhao via Ocfs2-devel
  0 siblings, 2 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 14:02 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/8/22 6:30 PM, Heming Zhao wrote:
> current ocfs2_fill_super error handling is very simple, and includes
> kernel crash possibility.
> 
> e.g.
> 
> ocfs2_fill_super
>   ocfs2_initialize_super // fails in check max_slots & return
> 
> Then osb->osb_mount_event is not properly initialized, and then will
> crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
> 
With patch 3, this case won't exist any more.

> For fixing the issue and enhancing error handling, we rewrite error
> handling of ocfs2_fill_super.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
>  1 file changed, 33 insertions(+), 34 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index d4b7a29cb364..c3efd9ad49ac 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
>  		status = -EINVAL;
> -		goto read_super_error;
> +		goto out;
>  	}
>  
>  	/* probe for superblock */
>  	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
>  	if (status < 0) {
>  		mlog(ML_ERROR, "superblock probe failed!\n");
> -		goto read_super_error;
> +		goto out;
>  	}
>  
>  	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> -	osb = OCFS2_SB(sb);
> -	if (status < 0) {
> -		mlog_errno(status);
> -		goto read_super_error;
> -	}
>  	brelse(bh);
>  	bh = NULL;
> +	if (status < 0)
> +		goto out;
> +
> +	osb = OCFS2_SB(sb);
>  
>  	if (!ocfs2_check_set_options(sb, &parsed_options)) {
>  		status = -EINVAL;
> -		goto read_super_error;
> +		goto out_super;
>  	}
>  	osb->s_mount_opt = parsed_options.mount_opt;
>  	osb->s_atime_quantum = parsed_options.atime_quantum;
> @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
>  	if (status)
> -		goto read_super_error;
> +		goto out_super;
>  
>  	sb->s_magic = OCFS2_SUPER_MAGIC;
>  
> @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  			status = -EACCES;
>  			mlog(ML_ERROR, "Readonly device detected but readonly "
>  			     "mount was not specified.\n");
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		/* You should not be able to start a local heartbeat
> @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  			status = -EROFS;
>  			mlog(ML_ERROR, "Local heartbeat specified on readonly "
>  			     "device.\n");
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		status = ocfs2_check_journals_nolocks(osb);
> @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  				mlog(ML_ERROR, "Recovery required on readonly "
>  				     "file system, but write access is "
>  				     "unavailable.\n");
> -			else
> -				mlog_errno(status);
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		ocfs2_set_ro_flag(osb, 1);
> @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  	}
>  
>  	status = ocfs2_verify_heartbeat(osb);
> -	if (status < 0) {
> -		mlog_errno(status);
> -		goto read_super_error;
> -	}
> +	if (status < 0)
> +		goto out_super;
>  
>  	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
>  						 ocfs2_debugfs_root);
> @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	status = ocfs2_mount_volume(sb);
>  	if (status < 0)
> -		goto read_super_error;
> +		goto out_debugfs;
>  
>  	if (osb->root_inode)
>  		inode = igrab(osb->root_inode);
>  
>  	if (!inode) {
>  		status = -EIO;
> -		mlog_errno(status);
> -		goto read_super_error;
> +		goto out_journal;
>  	}
>  
>  	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  	if (!osb->osb_dev_kset) {
>  		status = -ENOMEM;
>  		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> -		goto read_super_error;
> +		goto out_journal;
>  	}
>  
>  	/* Create filecheck sysfs related directories/files at
> @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  		status = -ENOMEM;
>  		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
>  			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> -		goto read_super_error;
> +		goto out_journal;
>  	}
>  
>  	root = d_make_root(inode);
>  	if (!root) {
>  		status = -ENOMEM;
> -		mlog_errno(status);
> -		goto read_super_error;
> +		goto out_journal;
>  	}
>  
>  	sb->s_root = root;
> @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	return status;
>  
> -read_super_error:
> -	brelse(bh);
> +out_journal:

rename it to out_dismount?

> +	mlog_errno(status);
> +	atomic_set(&osb->vol_state, VOLUME_DISABLED);
> +	wake_up(&osb->osb_mount_event);
> +	ocfs2_dismount_volume(sb, 1);
>  
> -	if (status)
> -		mlog_errno(status);
> +	return status;

Or remove mlog_errno() before and then just goto out?

>  
> -	if (osb) {
> -		atomic_set(&osb->vol_state, VOLUME_DISABLED);
> -		wake_up(&osb->osb_mount_event);
> -		ocfs2_dismount_volume(sb, 1);
> -	}
> +out_debugfs:
> +	debugfs_remove_recursive(osb->osb_debug_root);
> +out_super:
> +	ocfs2_release_system_inodes(osb);
> +	kfree(osb->recovery_map);
> +	ocfs2_delete_osb(osb);
> +	kfree(osb);

Seems we have to take care all resources in ocfs2_initialize_super()?

> +out:
> +	mlog_errno(status);
>  
As I described in your previous mail, ocfs2_sb_probe() may return error
with non-NULL bh, so we should brelse here.

Thanks,
Joseph

>  	return status;
>  }

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
  2022-04-09 13:11   ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:28     ` heming.zhao--- via Ocfs2-devel
  2022-04-10 12:00       ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:28 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/9/22 21:11, Joseph Qi wrote:
> Suggest rename the subject, something like
> ocfs2: fix mount crash if journal is not alloced

OK.

> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>> journal init later than before, it makes NULL pointer access in free
>> routine.
>>
>> Crash flow:
>>
>> ocfs2_fill_super
>>   + ocfs2_mount_volume
>>   |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>>   |  + ...
>>   |  + ocfs2_check_volume //no chance to init osb->journal
>>   |
>>   + ...
>>   + ocfs2_dismount_volume
>>      ocfs2_release_system_inodes
>>        ...
>>         evict
>>          ...
>>           ocfs2_clear_inode
>>            ocfs2_checkpoint_inode
>>             ocfs2_ci_fully_checkpointed
>>              time_after(journal->j_trans_id, ci->ci_last_trans)
>>               + journal is empty, crash!
>>
>> For fixing, there are three solutions:
>>
>> 1> Revert commit da5e7c87827e8
>>
>> For avoiding kernel crash, this make sense for us. We only concerned
>> whether there has any non-system inode access before dlm init. The
>> answer is NO. And all journal replay/recovery handling happen after
>> dlm & journal init done. So this method is not graceful but workable.
>>
>> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>>
>> The fix code is special for mounting phase, but it will continue
>> working after mounting stage. In another word, this method adds useless
>> code in normal inode free flow.
>>
>> 3> Do directly free inode in mounting phase
>>
>> This method is brutal/complex and may introduce unsafe code, currently
>> maintainer didn't like.
>>
>> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
>> We reverted journal init code, and kept cleanup code flow.
>>
>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/inode.c   |  4 ++--
>>   fs/ocfs2/journal.c | 21 +--------------------
>>   fs/ocfs2/super.c   | 33 +++++++++++++++++++++++++++++++++
>>   3 files changed, 36 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>> index 5739dc301569..bb116c39b581 100644
>> --- a/fs/ocfs2/inode.c
>> +++ b/fs/ocfs2/inode.c
>> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>   	struct inode *inode = NULL;
>>   	struct super_block *sb = osb->sb;
>>   	struct ocfs2_find_inode_args args;
>> +	journal_t *journal = osb->journal->j_journal;
>>   
>>   	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>>   			       sysfile_type);
>> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>   	 * part of the transaction - the inode could have been reclaimed and
>>   	 * now it is reread from disk.
>>   	 */
>> -	if (osb->journal) {
>> +	if (journal) {
>>   		transaction_t *transaction;
>>   		tid_t tid;
>>   		struct ocfs2_inode_info *oi = OCFS2_I(inode);
>> -		journal_t *journal = osb->journal->j_journal;
>>   
>>   		read_lock(&journal->j_state_lock);
>>   		if (journal->j_running_transaction)
>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>> index 1887a2708709..afb85de3bb60 100644
>> --- a/fs/ocfs2/journal.c
>> +++ b/fs/ocfs2/journal.c
>> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>>   	int status = -1;
>>   	struct inode *inode = NULL; /* the journal inode */
>>   	journal_t *j_journal = NULL;
>> -	struct ocfs2_journal *journal = NULL;
>> +	struct ocfs2_journal *journal = osb->journal;
>>   	struct ocfs2_dinode *di = NULL;
>>   	struct buffer_head *bh = NULL;
>>   	int inode_lock = 0;
>>   
>> -	/* initialize our journal structure */
>> -	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>> -	if (!journal) {
>> -		mlog(ML_ERROR, "unable to alloc journal\n");
>> -		status = -ENOMEM;
>> -		goto done;
>> -	}
>> -	osb->journal = journal;
>> -	journal->j_osb = osb;
>> -
>> -	atomic_set(&journal->j_num_trans, 0);
>> -	init_rwsem(&journal->j_trans_barrier);
>> -	init_waitqueue_head(&journal->j_checkpointed);
>> -	spin_lock_init(&journal->j_lock);
>> -	journal->j_trans_id = 1UL;
>> -	INIT_LIST_HEAD(&journal->j_la_cleanups);
>> -	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>> -	journal->j_state = OCFS2_JOURNAL_FREE;
>> -
>>   	/* already have the inode for our journal */
>>   	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>>   					    osb->slot_num);
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 477cdf94122e..5f63a2333e52 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   	int i, cbits, bbits;
>>   	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>   	struct inode *inode = NULL;
>> +	struct ocfs2_journal *journal;
>>   	struct ocfs2_super *osb;
>>   	u64 total_blocks;
>>   
>> @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   
>>   	get_random_bytes(&osb->s_next_generation, sizeof(u32));
>>   
>> +	/* FIXME
>> +	 * This should be done in ocfs2_journal_init(), but unknown
>> +	 * ordering issues will cause the filesystem to crash.
>> +	 * If anyone wants to figure out what part of the code
>> +	 * refers to osb->journal before ocfs2_journal_init() is run,
>> +	 * be my guest.
>> +	 */
>> +	/* initialize our journal structure */
>> +	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>> +	if (!journal) {
>> +		mlog(ML_ERROR, "unable to alloc journal\n");
>> +		status = -ENOMEM;
>> +		goto bail;
>> +	}
>> +	osb->journal = journal;
>> +	journal->j_osb = osb;
>> +
>> +	atomic_set(&journal->j_num_trans, 0);
>> +	init_rwsem(&journal->j_trans_barrier);
>> +	init_waitqueue_head(&journal->j_checkpointed);
>> +	spin_lock_init(&journal->j_lock);
>> +	journal->j_trans_id = 1UL;
>> +	INIT_LIST_HEAD(&journal->j_la_cleanups);
>> +	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>> +	journal->j_state = OCFS2_JOURNAL_FREE;
>> +
> We may fold these into a new function, such as ocfs2_journal_alloc().
> 

OK, will create a new function in journal.c
For this area FIXME comment, the legacy comment humor & vague. In my view, we
already got the order issue clearly, why not we change it as below?
     /*
      * FIXME
      * This should be done in ocfs2_journal_init(), but any inode
      * writes back operation will cause the filesystem to crash.
      */

Thanks,
Heming
> 
>>   	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>>   	init_llist_head(&osb->dquot_drop_list);
>>   
>> @@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>>   
>>   	kfree(osb->osb_orphan_wipes);
>>   	kfree(osb->slot_recovery_generations);
>> +	/* FIXME
>> +	 * This belongs in journal shutdown, but because we have to
>> +	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
>> +	 * we free it here.
>> +	 */
>> +	kfree(osb->journal);
>>   	kfree(osb->local_alloc_copy);
>>   	kfree(osb->uuid_str);
>>   	kfree(osb->vol_label);
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
  2022-04-09 13:14   ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:29     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:29 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/9/22 21:14, Joseph Qi wrote:
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> For rewrite error handling of ocfs2_initialize_super, change this
>> function return type from int to void.
>>
> Since ocfs2_resmap_init() always return 0, change it to void.

OK.

> 
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/reservations.c | 4 +---
>>   fs/ocfs2/reservations.h | 5 ++---
>>   fs/ocfs2/super.c        | 6 +-----
>>   3 files changed, 4 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
>> index 769e466887b0..a9d1296d736d 100644
>> --- a/fs/ocfs2/reservations.c
>> +++ b/fs/ocfs2/reservations.c
>> @@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
>>   	resv->r_flags |= flags;
>>   }
>>   
>> -int ocfs2_resmap_init(struct ocfs2_super *osb,
>> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>>   		      struct ocfs2_reservation_map *resmap)
>>   {
>>   	memset(resmap, 0, sizeof(*resmap));
>> @@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
>>   	resmap->m_reservations = RB_ROOT;
>>   	/* m_bitmap_len is initialized to zero by the above memset. */
>>   	INIT_LIST_HEAD(&resmap->m_lru);
>> -
>> -	return 0;
>>   }
>>   
>>   static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
>> diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
>> index 677c50663595..532d42eb6db7 100644
>> --- a/fs/ocfs2/reservations.h
>> +++ b/fs/ocfs2/reservations.h
>> @@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
>>    * @ops: unused for now
>>    * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
>>    *
>> - * Only possible return value other than '0' is -ENOMEM for failure to
>> - * allocation mirror bitmap.
>> + * This function does initialize resmap job.
>>    */
> This comments is meaningless, so just remove it looks better.

OK.

- Heming

> 
>> -int ocfs2_resmap_init(struct ocfs2_super *osb,
>> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>>   		      struct ocfs2_reservation_map *resmap);
>>   
>>   /**
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 8391592817f8..f91c5510bc7e 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   
>>   	init_waitqueue_head(&osb->osb_mount_event);
>>   
>> -	status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>> -	if (status) {
>> -		mlog_errno(status);
>> -		goto bail;
>> -	}
>> +	ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>>   
>>   	osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
>>   	if (!osb->vol_label) {
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-09 13:30   ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:47     ` heming.zhao--- via Ocfs2-devel
  2022-04-11  1:56       ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:47 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/9/22 21:30, Joseph Qi wrote:
> 
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After this patch, when error, ocfs2_fill_super doesn't take care to
>> release resources which are allocated in ocfs2_initialize_super.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>>   1 file changed, 40 insertions(+), 18 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index f91c5510bc7e..8443ba031dec 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   	if (!osb) {
>>   		status = -ENOMEM;
>>   		mlog_errno(status);
>> -		goto bail;
>> +		goto out;
>>   	}
>> ... ...
>>   
>> -bail:
>> +	return status;
>> +
>> +out_slot_info:
>> +	ocfs2_free_slot_info(osb);
>> +out_system_inodes:
>> +	ocfs2_release_system_inodes(osb);
>> +out_dlm_out:
>> +	ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>> +out_uuid_str:
>> +	kfree(osb->uuid_str);
>> +out_journal:
>> +	kfree(osb->journal);
>> +out_orphan_wipes:
>> +	kfree(osb->osb_orphan_wipes);
>> +out_slot_recovery_gen:
>> +	kfree(osb->slot_recovery_generations);
>> +out_vol_label:
>> +	kfree(osb->vol_label);
>> +out_recovery_map:
>> +	kfree(osb->recovery_map);
>> +out:
>> +	kfree(osb);
> 
> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
> 

Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
to return.

Thanks,
Heming
> 
>>   	return status;
>>   }
>>   
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-09 13:46   ` Joseph Qi via Ocfs2-devel
@ 2022-04-10  3:58     ` heming.zhao--- via Ocfs2-devel
  2022-04-12  8:22     ` Heming Zhao via Ocfs2-devel
  1 sibling, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-10  3:58 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/9/22 21:46, Joseph Qi wrote:
> 
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After this patch, when error, ocfs2_fill_super doesn't take care to
>> release resources which are allocated in ocfs2_mount_volume.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
>>   1 file changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 8443ba031dec..d4b7a29cb364 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>>   static int ocfs2_mount_volume(struct super_block *sb)
>>   {
>>   	int status = 0;
>> -	int unlock_super = 0;
>>   	struct ocfs2_super *osb = OCFS2_SB(sb);
>>   
>>   	if (ocfs2_is_hard_readonly(osb))
>> -		goto leave;
>> +		goto out;
>>   
>>   	mutex_init(&osb->obs_trim_fs_mutex);
>>   
>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
>>   		if (status == -EBADR && ocfs2_userspace_stack(osb))
>>   			mlog(ML_ERROR, "couldn't mount because cluster name on"
>>   			" disk does not match the running cluster name.\n");
>> -		goto leave;
>> +		goto out;
>>   	}
>>   
>>   	status = ocfs2_super_lock(osb, 1);
>>   	if (status < 0) {
>>   		mlog_errno(status);
>> -		goto leave;
>> +		goto out_dlm;
>>   	}
>> -	unlock_super = 1;
>>   
>>   	/* This will load up the node map and add ourselves to it. */
>>   	status = ocfs2_find_slot(osb);
>>   	if (status < 0) {
>>   		mlog_errno(status);
>> -		goto leave;
>> +		goto out_super_lock;
>>   	}
>>   
>>   	/* load all node-local system inodes */
>>   	status = ocfs2_init_local_system_inodes(osb);
>>   	if (status < 0) {
>>   		mlog_errno(status);
>> -		goto leave;
>> +		goto out_super_lock;
>>   	}
>>   
>>   	status = ocfs2_check_volume(osb);
>>   	if (status < 0) {
>>   		mlog_errno(status);
>> -		goto leave;
>> +		goto out_system_inodes;
>>   	}
>>   
>>   	status = ocfs2_truncate_log_init(osb);
>> -	if (status < 0)
>> +	if (status < 0) {
>>   		mlog_errno(status);
>> +		goto out_journal_shutdown;
> 
> ocfs2_check_volume() not only does journal initialization,
> but also local alloc.

I agree and already handled this case. please read following explanation.




call flow:



ocfs2_fill_super

  ocfs2_mount_volume

   + ocfs2_init_local_system_inodes //[1]

   + ocfs2_check_volume

      ocfs2_load_local_alloc //may fail



If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.

And other alloced inodes (from [1]) still exist.


also note, the sys inodes release function is ocfs2_release_system_inodes(),

which could handle both osb->global_system_inodes[] & osb->local_system_inodes
[]
are NULL pointer case.



the cleanup job will be done in both places:
- ocfs2_mount_volume (patch 4/5)

- ocfs2_fill_super (patch 5/5)



related code with patch 4/5:



```ocfs2_mount_volume()


     ... ...

     status = ocfs2_check_volume(osb);

     if (status < 0) {

         mlog_errno(status);



         //need to release global/local sys inodes (by ocfs2_release_system_inodes())

         // which alloced by above [1].


         goto out_system_inodes;

     }

     status = ocfs2_truncate_log_init(osb);

     if (status < 0) {

         mlog_errno(status);



         //need to release journal: by ocfs2_journal_shutdown()

         //need to release global/local sys inodes: by ocfs2_release_system_inodes()

         goto out_journal_shutdown;

     }

     ocfs2_super_unlock(osb, 1);

     return 0;


out_journal_shutdown:

     ocfs2_journal_shutdown(osb); //through to label "out_system_inodes".

out_system_inodes:

     ocfs2_release_system_inodes(osb);

out_super_lock:

     ocfs2_super_unlock(osb, 1);

out_dlm:

     ocfs2_dlm_shutdown(osb, 0);

out:
     return status;

}

```

> 
>> +	}
>>   
>> -leave:
>> -	if (unlock_super)
>> -		ocfs2_super_unlock(osb, 1);
>> +	ocfs2_super_unlock(osb, 1);
>> +	return status;
> Explicitly return 0 may be better, which means this is the
> normal path.

OK

Thanks,
Heming

> 
>>
>> +out_journal_shutdown:
>> +	ocfs2_journal_shutdown(osb);
>> +out_system_inodes:
>> +	ocfs2_release_system_inodes(osb);
>> +out_super_lock:
>> +	ocfs2_super_unlock(osb, 1);
>> +out_dlm:
>> +	ocfs2_dlm_shutdown(osb, 0);
>> +out:
>>   	return status;
>>   }
>>   
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-09 14:02   ` Joseph Qi via Ocfs2-devel
@ 2022-04-10 10:25     ` heming.zhao--- via Ocfs2-devel
  2022-04-12  8:49     ` Heming Zhao via Ocfs2-devel
  1 sibling, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-10 10:25 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/9/22 22:02, Joseph Qi wrote:
> 
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> current ocfs2_fill_super error handling is very simple, and includes
>> kernel crash possibility.
>>
>> e.g.
>>
>> ocfs2_fill_super
>>    ocfs2_initialize_super // fails in check max_slots & return
>>
>> Then osb->osb_mount_event is not properly initialized, and then will
>> crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
>>
> With patch 3, this case won't exist any more.

Agree. After patch 3, ocfs2_initialize_super() will free osb before returning

failure, and this issue won't be triggered.
I will change it in next version.

> 
>> For fixing the issue and enhancing error handling, we rewrite error
>> handling of ocfs2_fill_super.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
>>   1 file changed, 33 insertions(+), 34 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index d4b7a29cb364..c3efd9ad49ac 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
>>   		status = -EINVAL;
>> -		goto read_super_error;
>> +		goto out;
>>   	}
>>   
>>   	/* probe for superblock */
>>   	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
>>   	if (status < 0) {
>>   		mlog(ML_ERROR, "superblock probe failed!\n");
>> -		goto read_super_error;
>> +		goto out;
>>   	}
>>   
>>   	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
>> -	osb = OCFS2_SB(sb);
>> -	if (status < 0) {
>> -		mlog_errno(status);
>> -		goto read_super_error;
>> -	}
>>   	brelse(bh);
>>   	bh = NULL;
>> +	if (status < 0)
>> +		goto out;
>> +
>> +	osb = OCFS2_SB(sb);
>>   
>>   	if (!ocfs2_check_set_options(sb, &parsed_options)) {
>>   		status = -EINVAL;
>> -		goto read_super_error;
>> +		goto out_super;
>>   	}
>>   	osb->s_mount_opt = parsed_options.mount_opt;
>>   	osb->s_atime_quantum = parsed_options.atime_quantum;
>> @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
>>   	if (status)
>> -		goto read_super_error;
>> +		goto out_super;
>>   
>>   	sb->s_magic = OCFS2_SUPER_MAGIC;
>>   
>> @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   			status = -EACCES;
>>   			mlog(ML_ERROR, "Readonly device detected but readonly "
>>   			     "mount was not specified.\n");
>> -			goto read_super_error;
>> +			goto out_super;
>>   		}
>>   
>>   		/* You should not be able to start a local heartbeat
>> @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   			status = -EROFS;
>>   			mlog(ML_ERROR, "Local heartbeat specified on readonly "
>>   			     "device.\n");
>> -			goto read_super_error;
>> +			goto out_super;
>>   		}
>>   
>>   		status = ocfs2_check_journals_nolocks(osb);
>> @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   				mlog(ML_ERROR, "Recovery required on readonly "
>>   				     "file system, but write access is "
>>   				     "unavailable.\n");
>> -			else
>> -				mlog_errno(status);
>> -			goto read_super_error;
>> +			goto out_super;
>>   		}
>>   
>>   		ocfs2_set_ro_flag(osb, 1);
>> @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   	}
>>   
>>   	status = ocfs2_verify_heartbeat(osb);
>> -	if (status < 0) {
>> -		mlog_errno(status);
>> -		goto read_super_error;
>> -	}
>> +	if (status < 0)
>> +		goto out_super;
>>   
>>   	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
>>   						 ocfs2_debugfs_root);
>> @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	status = ocfs2_mount_volume(sb);
>>   	if (status < 0)
>> -		goto read_super_error;
>> +		goto out_debugfs;
>>   
>>   	if (osb->root_inode)
>>   		inode = igrab(osb->root_inode);
>>   
>>   	if (!inode) {
>>   		status = -EIO;
>> -		mlog_errno(status);
>> -		goto read_super_error;
>> +		goto out_journal;
>>   	}
>>   
>>   	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
>> @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   	if (!osb->osb_dev_kset) {
>>   		status = -ENOMEM;
>>   		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
>> -		goto read_super_error;
>> +		goto out_journal;
>>   	}
>>   
>>   	/* Create filecheck sysfs related directories/files at
>> @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   		status = -ENOMEM;
>>   		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
>>   			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
>> -		goto read_super_error;
>> +		goto out_journal;
>>   	}
>>   
>>   	root = d_make_root(inode);
>>   	if (!root) {
>>   		status = -ENOMEM;
>> -		mlog_errno(status);
>> -		goto read_super_error;
>> +		goto out_journal;
>>   	}
>>   
>>   	sb->s_root = root;
>> @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>   
>>   	return status;
>>   
>> -read_super_error:
>> -	brelse(bh);
>> +out_journal:
> 
> rename it to out_dismount?

agree.

> 
>> +	mlog_errno(status);
>> +	atomic_set(&osb->vol_state, VOLUME_DISABLED);
>> +	wake_up(&osb->osb_mount_event);
>> +	ocfs2_dismount_volume(sb, 1);
>>   
>> -	if (status)
>> -		mlog_errno(status);
>> +	return status;
> 
> Or remove mlog_errno() before and then just goto out?

your mean: at the end of lable "out_journal", call "goto out"?

if yes, no problem for me.

> 
>>   
>> -	if (osb) {
>> -		atomic_set(&osb->vol_state, VOLUME_DISABLED);
>> -		wake_up(&osb->osb_mount_event);
>> -		ocfs2_dismount_volume(sb, 1);
>> -	}
>> +out_debugfs:
>> +	debugfs_remove_recursive(osb->osb_debug_root);
>> +out_super:
>> +	ocfs2_release_system_inodes(osb);
>> +	kfree(osb->recovery_map);
>> +	ocfs2_delete_osb(osb);
>> +	kfree(osb);
> 
> Seems we have to take care all resources in ocfs2_initialize_super()?

Yes, I checked every lines.



for ocfs2_initialize_super(), I lists all my found resources:

(list in time order):

   osb, osb->recovery_map, osb->vol_label, osb->slot_recovery_generations,

   osb->osb_orphan_wipes, osb->journal, osb->uuid_str

   - all above have itselves out_xx label.

   osb->osb_dlm_debug & dlm sysfs stuffs.

   - release by out_dlm_out

   system inodes (by ocfs2_init_global_system_inodes)

   - release by out_system_inodes

   osb->slot_info, osb->slot_info->si_bh

   - release by out_slot_info

   workqueue:osb->ocfs2_wq

   - will be freed in ocfs2_fill_super label "out_super": ocfs2_delete_osb().
     or be freed in ocfs2_dismount_volume => ocfs2_delete_osb().

> 
>> +out:
>> +	mlog_errno(status);
>>   
> As I described in your previous mail, ocfs2_sb_probe() may return error
> with non-NULL bh, so we should brelse here.
> 

I also explained why I dropped the brelse(bh), maybe anti-spam system ate my mail.

(I found my emails also disappeared in other mailling lists.)



I re-explain here. below code based on patch 5/5.

please note <2> & <4>.



```

static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)

{



     ... ...

     if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {

         status = -EINVAL;

         goto out;

     }



     //hm: <1> bh is alloced/created from here.

     /* probe for superblock */

     status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);

     if (status < 0) {

         mlog(ML_ERROR, "superblock probe failed!\n");

         //hm: <2> if error, ocfs2_sb_probe() will release bh. So "out" label

         //doesn't need to call brelse(bh).

         goto out;

     }



     status = ocfs2_initialize_super(sb, bh, sector_size, &stats);

     brelse(bh); //hm: <3> release bh before assessment status.

     bh = NULL;

     if (status < 0)

         goto out; //hm: <4> if error, bh is already released before "if()",

                   // So "out" label does't need to call brelse(bh)



     osb = OCFS2_SB(sb);



     ... ...



out:

     mlog_errno(status);



     return status;

}

```

Thanks,
Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
  2022-04-09 16:28     ` heming.zhao--- via Ocfs2-devel
@ 2022-04-10 12:00       ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-10 12:00 UTC (permalink / raw)
  To: heming.zhao, ocfs2-devel



On 4/10/22 12:28 AM, heming.zhao@suse.com wrote:
> On 4/9/22 21:11, Joseph Qi wrote:
>> Suggest rename the subject, something like
>> ocfs2: fix mount crash if journal is not alloced
> 
> OK.
> 
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>>> journal init later than before, it makes NULL pointer access in free
>>> routine.
>>>
>>> Crash flow:
>>>
>>> ocfs2_fill_super
>>>   + ocfs2_mount_volume
>>>   |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>>>   |  + ...
>>>   |  + ocfs2_check_volume //no chance to init osb->journal
>>>   |
>>>   + ...
>>>   + ocfs2_dismount_volume
>>>      ocfs2_release_system_inodes
>>>        ...
>>>         evict
>>>          ...
>>>           ocfs2_clear_inode
>>>            ocfs2_checkpoint_inode
>>>             ocfs2_ci_fully_checkpointed
>>>              time_after(journal->j_trans_id, ci->ci_last_trans)
>>>               + journal is empty, crash!
>>>
>>> For fixing, there are three solutions:
>>>
>>> 1> Revert commit da5e7c87827e8
>>>
>>> For avoiding kernel crash, this make sense for us. We only concerned
>>> whether there has any non-system inode access before dlm init. The
>>> answer is NO. And all journal replay/recovery handling happen after
>>> dlm & journal init done. So this method is not graceful but workable.
>>>
>>> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>>>
>>> The fix code is special for mounting phase, but it will continue
>>> working after mounting stage. In another word, this method adds useless
>>> code in normal inode free flow.
>>>
>>> 3> Do directly free inode in mounting phase
>>>
>>> This method is brutal/complex and may introduce unsafe code, currently
>>> maintainer didn't like.
>>>
>>> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
>>> We reverted journal init code, and kept cleanup code flow.
>>>
>>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>>   fs/ocfs2/inode.c   |  4 ++--
>>>   fs/ocfs2/journal.c | 21 +--------------------
>>>   fs/ocfs2/super.c   | 33 +++++++++++++++++++++++++++++++++
>>>   3 files changed, 36 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>>> index 5739dc301569..bb116c39b581 100644
>>> --- a/fs/ocfs2/inode.c
>>> +++ b/fs/ocfs2/inode.c
>>> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>>       struct inode *inode = NULL;
>>>       struct super_block *sb = osb->sb;
>>>       struct ocfs2_find_inode_args args;
>>> +    journal_t *journal = osb->journal->j_journal;
>>>         trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>>>                      sysfile_type);
>>> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>>        * part of the transaction - the inode could have been reclaimed and
>>>        * now it is reread from disk.
>>>        */
>>> -    if (osb->journal) {
>>> +    if (journal) {
>>>           transaction_t *transaction;
>>>           tid_t tid;
>>>           struct ocfs2_inode_info *oi = OCFS2_I(inode);
>>> -        journal_t *journal = osb->journal->j_journal;
>>>             read_lock(&journal->j_state_lock);
>>>           if (journal->j_running_transaction)
>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>> index 1887a2708709..afb85de3bb60 100644
>>> --- a/fs/ocfs2/journal.c
>>> +++ b/fs/ocfs2/journal.c
>>> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>>>       int status = -1;
>>>       struct inode *inode = NULL; /* the journal inode */
>>>       journal_t *j_journal = NULL;
>>> -    struct ocfs2_journal *journal = NULL;
>>> +    struct ocfs2_journal *journal = osb->journal;
>>>       struct ocfs2_dinode *di = NULL;
>>>       struct buffer_head *bh = NULL;
>>>       int inode_lock = 0;
>>>   -    /* initialize our journal structure */
>>> -    journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>>> -    if (!journal) {
>>> -        mlog(ML_ERROR, "unable to alloc journal\n");
>>> -        status = -ENOMEM;
>>> -        goto done;
>>> -    }
>>> -    osb->journal = journal;
>>> -    journal->j_osb = osb;
>>> -
>>> -    atomic_set(&journal->j_num_trans, 0);
>>> -    init_rwsem(&journal->j_trans_barrier);
>>> -    init_waitqueue_head(&journal->j_checkpointed);
>>> -    spin_lock_init(&journal->j_lock);
>>> -    journal->j_trans_id = 1UL;
>>> -    INIT_LIST_HEAD(&journal->j_la_cleanups);
>>> -    INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>>> -    journal->j_state = OCFS2_JOURNAL_FREE;
>>> -
>>>       /* already have the inode for our journal */
>>>       inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>>>                           osb->slot_num);
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index 477cdf94122e..5f63a2333e52 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>>       int i, cbits, bbits;
>>>       struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>>       struct inode *inode = NULL;
>>> +    struct ocfs2_journal *journal;
>>>       struct ocfs2_super *osb;
>>>       u64 total_blocks;
>>>   @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>>         get_random_bytes(&osb->s_next_generation, sizeof(u32));
>>>   +    /* FIXME
>>> +     * This should be done in ocfs2_journal_init(), but unknown
>>> +     * ordering issues will cause the filesystem to crash.
>>> +     * If anyone wants to figure out what part of the code
>>> +     * refers to osb->journal before ocfs2_journal_init() is run,
>>> +     * be my guest.
>>> +     */
>>> +    /* initialize our journal structure */
>>> +    journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>>> +    if (!journal) {
>>> +        mlog(ML_ERROR, "unable to alloc journal\n");
>>> +        status = -ENOMEM;
>>> +        goto bail;
>>> +    }
>>> +    osb->journal = journal;
>>> +    journal->j_osb = osb;
>>> +
>>> +    atomic_set(&journal->j_num_trans, 0);
>>> +    init_rwsem(&journal->j_trans_barrier);
>>> +    init_waitqueue_head(&journal->j_checkpointed);
>>> +    spin_lock_init(&journal->j_lock);
>>> +    journal->j_trans_id = 1UL;
>>> +    INIT_LIST_HEAD(&journal->j_la_cleanups);
>>> +    INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>>> +    journal->j_state = OCFS2_JOURNAL_FREE;
>>> +
>> We may fold these into a new function, such as ocfs2_journal_alloc().
>>
> 
> OK, will create a new function in journal.c
> For this area FIXME comment, the legacy comment humor & vague. In my view, we
> already got the order issue clearly, why not we change it as below?
>     /*
>      * FIXME
>      * This should be done in ocfs2_journal_init(), but any inode
>      * writes back operation will cause the filesystem to crash.
>      */

Sounds good.

Thanks,
Joseph

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-09 16:47     ` heming.zhao--- via Ocfs2-devel
@ 2022-04-11  1:56       ` Joseph Qi via Ocfs2-devel
  2022-04-11  2:09         ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-11  1:56 UTC (permalink / raw)
  To: heming.zhao, ocfs2-devel



On 4/10/22 12:47 AM, heming.zhao@suse.com wrote:
> On 4/9/22 21:30, Joseph Qi wrote:
>>
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>> release resources which are allocated in ocfs2_initialize_super.
>>>
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>>   fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>>>   1 file changed, 40 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index f91c5510bc7e..8443ba031dec 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>>       if (!osb) {
>>>           status = -ENOMEM;
>>>           mlog_errno(status);
>>> -        goto bail;
>>> +        goto out;
>>>       }
>>> ... ...
>>>   -bail:
>>> +    return status;
>>> +
>>> +out_slot_info:
>>> +    ocfs2_free_slot_info(osb);
>>> +out_system_inodes:
>>> +    ocfs2_release_system_inodes(osb);
>>> +out_dlm_out:
>>> +    ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>>> +out_uuid_str:
>>> +    kfree(osb->uuid_str);
>>> +out_journal:
>>> +    kfree(osb->journal);
>>> +out_orphan_wipes:
>>> +    kfree(osb->osb_orphan_wipes);
>>> +out_slot_recovery_gen:
>>> +    kfree(osb->slot_recovery_generations);
>>> +out_vol_label:
>>> +    kfree(osb->vol_label);
>>> +out_recovery_map:
>>> +    kfree(osb->recovery_map);
>>> +out:
>>> +    kfree(osb);
>>
>> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
>>
> 
> Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
> triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
> to return.
> 

Right, but we'd better make this patch look more complete.

Thanks,
Joseph

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-11  1:56       ` Joseph Qi via Ocfs2-devel
@ 2022-04-11  2:09         ` heming.zhao--- via Ocfs2-devel
  0 siblings, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-11  2:09 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/11/22 09:56, Joseph Qi wrote:
> 
> 
> On 4/10/22 12:47 AM, heming.zhao@suse.com wrote:
>> On 4/9/22 21:30, Joseph Qi wrote:
>>>
>>>
>>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>>> release resources which are allocated in ocfs2_initialize_super.
>>>>
>>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>>> ---
>>>>    fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>>>>    1 file changed, 40 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>>> index f91c5510bc7e..8443ba031dec 100644
>>>> --- a/fs/ocfs2/super.c
>>>> +++ b/fs/ocfs2/super.c
>>>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>>>        if (!osb) {
>>>>            status = -ENOMEM;
>>>>            mlog_errno(status);
>>>> -        goto bail;
>>>> +        goto out;
>>>>        }
>>>> ... ...
>>>>    -bail:
>>>> +    return status;
>>>> +
>>>> +out_slot_info:
>>>> +    ocfs2_free_slot_info(osb);
>>>> +out_system_inodes:
>>>> +    ocfs2_release_system_inodes(osb);
>>>> +out_dlm_out:
>>>> +    ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>>>> +out_uuid_str:
>>>> +    kfree(osb->uuid_str);
>>>> +out_journal:
>>>> +    kfree(osb->journal);
>>>> +out_orphan_wipes:
>>>> +    kfree(osb->osb_orphan_wipes);
>>>> +out_slot_recovery_gen:
>>>> +    kfree(osb->slot_recovery_generations);
>>>> +out_vol_label:
>>>> +    kfree(osb->vol_label);
>>>> +out_recovery_map:
>>>> +    kfree(osb->recovery_map);
>>>> +out:
>>>> +    kfree(osb);
>>>
>>> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
>>>
>>
>> Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
>> triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
>> to return.
>>
> 
> Right, but we'd better make this patch look more complete.
> 

OK, I will make a complete patch on next verion.
And I found Thunderbird automatically adds '\n' in reply mail, which make the mail
format messy. I am sorry for this, and try to switch from Thunderbird to mutt.

Thanks,
Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-09 13:46   ` Joseph Qi via Ocfs2-devel
  2022-04-10  3:58     ` heming.zhao--- via Ocfs2-devel
@ 2022-04-12  8:22     ` Heming Zhao via Ocfs2-devel
  2022-04-12 11:54       ` Joseph Qi via Ocfs2-devel
  1 sibling, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12  8:22 UTC (permalink / raw)
  To: Joseph Qi; +Cc: ocfs2-devel

My last reply (on 4.10) had messy format, because thunderbird automatically
add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
Please check my comments in below.

On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
> 
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
> > After this patch, when error, ocfs2_fill_super doesn't take care to
> > release resources which are allocated in ocfs2_mount_volume.
> > 
> > Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> > ---
> >  fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
> >  1 file changed, 20 insertions(+), 12 deletions(-)
> > 
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index 8443ba031dec..d4b7a29cb364 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
> >  static int ocfs2_mount_volume(struct super_block *sb)
> >  {
> >  	int status = 0;
> > -	int unlock_super = 0;
> >  	struct ocfs2_super *osb = OCFS2_SB(sb);
> >  
> >  	if (ocfs2_is_hard_readonly(osb))
> > -		goto leave;
> > +		goto out;
> >  
> >  	mutex_init(&osb->obs_trim_fs_mutex);
> >  
> > @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
> >  		if (status == -EBADR && ocfs2_userspace_stack(osb))
> >  			mlog(ML_ERROR, "couldn't mount because cluster name on"
> >  			" disk does not match the running cluster name.\n");
> > -		goto leave;
> > +		goto out;
> >  	}
> >  
> >  	status = ocfs2_super_lock(osb, 1);
> >  	if (status < 0) {
> >  		mlog_errno(status);
> > -		goto leave;
> > +		goto out_dlm;
> >  	}
> > -	unlock_super = 1;
> >  
> >  	/* This will load up the node map and add ourselves to it. */
> >  	status = ocfs2_find_slot(osb);
> >  	if (status < 0) {
> >  		mlog_errno(status);
> > -		goto leave;
> > +		goto out_super_lock;
> >  	}
> >  
> >  	/* load all node-local system inodes */
> >  	status = ocfs2_init_local_system_inodes(osb);
> >  	if (status < 0) {
> >  		mlog_errno(status);
> > -		goto leave;
> > +		goto out_super_lock;
> >  	}
> >  
> >  	status = ocfs2_check_volume(osb);
> >  	if (status < 0) {
> >  		mlog_errno(status);
> > -		goto leave;
> > +		goto out_system_inodes;
> >  	}
> >  
> >  	status = ocfs2_truncate_log_init(osb);
> > -	if (status < 0)
> > +	if (status < 0) {
> >  		mlog_errno(status);
> > +		goto out_journal_shutdown;
> 
> ocfs2_check_volume() not only does journal initialization,
> but also local alloc.
> 

I agree and already handled this case. Please read following explanation.

Call flow:

ocfs2_fill_super
 ocfs2_mount_volume
  + ocfs2_init_local_system_inodes //[1]
  + ocfs2_check_volume
     ocfs2_load_local_alloc //may fail

If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
And other alloced inodes (from [1]) still exist.

Also note, the sys inodes release function is ocfs2_release_system_inodes(),
which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
are NULL pointer case.

The cleanup job will be done in both places:
- ocfs2_mount_volume (patch 4/5)
- ocfs2_fill_super (patch 5/5)

Related code after patched 4/5:
(please check my comment start with "hm:")
```
static int ocfs2_mount_volume(struct super_block *sb)
{
	... ...
	
	status = ocfs2_check_volume(osb);
	if (status < 0) {
		mlog_errno(status);
		
		//hm:
		//need to release global/local sys inodes (by //ocfs2_release_system_inodes())
		//which are alloced by above [1].
		goto out_system_inodes;
	}
	
	status = ocfs2_truncate_log_init(osb);
	if (status < 0) {
		mlog_errno(status);
		
		//hm:
		//need to release journal: by ocfs2_journal_shutdown()
		//need to release global/local sys inodes: byocfs2_release_system_inodes()
		goto out_journal_shutdown;
	}
	
	ocfs2_super_unlock(osb, 1);
	return status; //<== hm: will change to 0 under next comment
	
out_journal_shutdown:
	ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
out_system_inodes:
	ocfs2_release_system_inodes(osb);
out_super_lock:
	ocfs2_super_unlock(osb, 1);
out_dlm:
	ocfs2_dlm_shutdown(osb, 0);
out:
	return status;
}
``` 

> > +	}
> >  
> > -leave:
> > -	if (unlock_super)
> > -		ocfs2_super_unlock(osb, 1);
> > +	ocfs2_super_unlock(osb, 1);
> > +	return status;
> Explicitly return 0 may be better, which means this is the
> normal path.
> 

OK.

- Heming
> 
> >
> > +out_journal_shutdown:
> > +	ocfs2_journal_shutdown(osb);
> > +out_system_inodes:
> > +	ocfs2_release_system_inodes(osb);
> > +out_super_lock:
> > +	ocfs2_super_unlock(osb, 1);
> > +out_dlm:
> > +	ocfs2_dlm_shutdown(osb, 0);
> > +out:
> >  	return status;
> >  }
> >  
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-09 14:02   ` Joseph Qi via Ocfs2-devel
  2022-04-10 10:25     ` heming.zhao--- via Ocfs2-devel
@ 2022-04-12  8:49     ` Heming Zhao via Ocfs2-devel
  1 sibling, 0 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12  8:49 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

Resend my reply for easy review.

On Sat, Apr 09, 2022 at 10:02:16PM +0800, Joseph Qi wrote:
> 
> 
> On 4/8/22 6:30 PM, Heming Zhao wrote:
> > current ocfs2_fill_super error handling is very simple, and includes
> > kernel crash possibility.
> > 
> > e.g.
> > 
> > ocfs2_fill_super
> >   ocfs2_initialize_super // fails in check max_slots & return
> > 
> > Then osb->osb_mount_event is not properly initialized, and then will
> > crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
> > 
> With patch 3, this case won't exist any more.

Agree. After patch 3, ocfs2_initialize_super() will free osb before returning
failure, and this issue won't be triggered.
I will change the commit log in next version.

> 
> > For fixing the issue and enhancing error handling, we rewrite error
> > handling of ocfs2_fill_super.
> > 
> > Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> > ---
> >  fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
> >  1 file changed, 33 insertions(+), 34 deletions(-)
> > 
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index d4b7a29cb364..c3efd9ad49ac 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
> >  		status = -EINVAL;
> > -		goto read_super_error;
> > +		goto out;
> >  	}
> >  
> >  	/* probe for superblock */
> >  	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
> >  	if (status < 0) {
> >  		mlog(ML_ERROR, "superblock probe failed!\n");
> > -		goto read_super_error;
> > +		goto out;
> >  	}
> >  
> >  	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> > -	osb = OCFS2_SB(sb);
> > -	if (status < 0) {
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > -	}
> >  	brelse(bh);
> >  	bh = NULL;
> > +	if (status < 0)
> > +		goto out;
> > +
> > +	osb = OCFS2_SB(sb);
> >  
> >  	if (!ocfs2_check_set_options(sb, &parsed_options)) {
> >  		status = -EINVAL;
> > -		goto read_super_error;
> > +		goto out_super;
> >  	}
> >  	osb->s_mount_opt = parsed_options.mount_opt;
> >  	osb->s_atime_quantum = parsed_options.atime_quantum;
> > @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
> >  	if (status)
> > -		goto read_super_error;
> > +		goto out_super;
> >  
> >  	sb->s_magic = OCFS2_SUPER_MAGIC;
> >  
> > @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  			status = -EACCES;
> >  			mlog(ML_ERROR, "Readonly device detected but readonly "
> >  			     "mount was not specified.\n");
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		/* You should not be able to start a local heartbeat
> > @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  			status = -EROFS;
> >  			mlog(ML_ERROR, "Local heartbeat specified on readonly "
> >  			     "device.\n");
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		status = ocfs2_check_journals_nolocks(osb);
> > @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  				mlog(ML_ERROR, "Recovery required on readonly "
> >  				     "file system, but write access is "
> >  				     "unavailable.\n");
> > -			else
> > -				mlog_errno(status);
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		ocfs2_set_ro_flag(osb, 1);
> > @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  	}
> >  
> >  	status = ocfs2_verify_heartbeat(osb);
> > -	if (status < 0) {
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > -	}
> > +	if (status < 0)
> > +		goto out_super;
> >  
> >  	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
> >  						 ocfs2_debugfs_root);
> > @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	status = ocfs2_mount_volume(sb);
> >  	if (status < 0)
> > -		goto read_super_error;
> > +		goto out_debugfs;
> >  
> >  	if (osb->root_inode)
> >  		inode = igrab(osb->root_inode);
> >  
> >  	if (!inode) {
> >  		status = -EIO;
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > +		goto out_journal;
> >  	}
> >  
> >  	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> > @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  	if (!osb->osb_dev_kset) {
> >  		status = -ENOMEM;
> >  		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> > -		goto read_super_error;
> > +		goto out_journal;
> >  	}
> >  
> >  	/* Create filecheck sysfs related directories/files at
> > @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  		status = -ENOMEM;
> >  		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
> >  			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> > -		goto read_super_error;
> > +		goto out_journal;
> >  	}
> >  
> >  	root = d_make_root(inode);
> >  	if (!root) {
> >  		status = -ENOMEM;
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > +		goto out_journal;
> >  	}
> >  
> >  	sb->s_root = root;
> > @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	return status;
> >  
> > -read_super_error:
> > -	brelse(bh);
> > +out_journal:
> 
> rename it to out_dismount?

agree.

> 
> > +	mlog_errno(status);
> > +	atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > +	wake_up(&osb->osb_mount_event);
> > +	ocfs2_dismount_volume(sb, 1);
> >  
> > -	if (status)
> > -		mlog_errno(status);
> > +	return status;
> 
> Or remove mlog_errno() before and then just goto out?

Your mean: at the end of lable "out_journal", call "goto out"?
If yes, no problem for me.

> 
> >  
> > -	if (osb) {
> > -		atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > -		wake_up(&osb->osb_mount_event);
> > -		ocfs2_dismount_volume(sb, 1);
> > -	}
> > +out_debugfs:
> > +	debugfs_remove_recursive(osb->osb_debug_root);
> > +out_super:
> > +	ocfs2_release_system_inodes(osb);
> > +	kfree(osb->recovery_map);
> > +	ocfs2_delete_osb(osb);
> > +	kfree(osb);
> 
> Seems we have to take care all resources in ocfs2_initialize_super()?

Yes, I checked every line.

for ocfs2_initialize_super(), I lists all my found resources:
(list in time order):

  osb, osb->recovery_map, osb->vol_label, osb->slot_recovery_generations,
  osb->osb_orphan_wipes, osb->journal, osb->uuid_str
  - all above have itselves out_xx label.
  
  osb->osb_dlm_debug & dlm sysfs stuffs.
  - release by out_dlm_out label
  
  system inodes (by ocfs2_init_global_system_inodes)
  - release by out_system_inodes label
  
  osb->slot_info, osb->slot_info->si_bh
  - release by out_slot_info label
  
  workqueue:osb->ocfs2_wq
  - will be freed in ocfs2_fill_super label "out_super": ocfs2_delete_osb().
    or be freed in ocfs2_dismount_volume => ocfs2_delete_osb(). 

> 
> > +out:
> > +	mlog_errno(status);
> >  
> As I described in your previous mail, ocfs2_sb_probe() may return error
> with non-NULL bh, so we should brelse here.
> 

I also explained why I dropped the brelse(bh), maybe anti-spam system
ate my mail.  (I found my emails also disappeared in other mailling lists.)

I re-explain here. Below code based on patch 5/5.  Please note <2> & <4>.

```
static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
{
	... ...

	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
		status = -EINVAL;
		goto out;
	} 

	//hm: <1> bh is alloced/created from here.
	/* probe for superblock */
	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
	if (status < 0) {
		mlog(ML_ERROR, "superblock probe failed!\n");

		//hm: <2> if error, ocfs2_sb_probe() will release bh. So "out" label
		//        doesn't need to call brelse(bh).
		goto out;
	}

	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
	brelse(bh); //hm: <3> release bh before assessment status.
	bh = NULL;
	if (status < 0)
		goto out; //hm: <4> if error, bh is already released before "if()",
	              //        So "out" label does't need to call brelse(bh)

	osb = OCFS2_SB(sb);

	...  ...

out: //hm: no need to call brelse(bh)
	mlog_errno(status);
	return status;
}
```

Thanks,
Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-12  8:22     ` Heming Zhao via Ocfs2-devel
@ 2022-04-12 11:54       ` Joseph Qi via Ocfs2-devel
  2022-04-12 16:46         ` Heming Zhao via Ocfs2-devel
  0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-12 11:54 UTC (permalink / raw)
  To: Heming Zhao; +Cc: ocfs2-devel



On 4/12/22 4:22 PM, Heming Zhao wrote:
> My last reply (on 4.10) had messy format, because thunderbird automatically
> add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
> Please check my comments in below.
> 
> On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
>>
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>> release resources which are allocated in ocfs2_mount_volume.
>>>
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>>  fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
>>>  1 file changed, 20 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index 8443ba031dec..d4b7a29cb364 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>>>  static int ocfs2_mount_volume(struct super_block *sb)
>>>  {
>>>  	int status = 0;
>>> -	int unlock_super = 0;
>>>  	struct ocfs2_super *osb = OCFS2_SB(sb);
>>>  
>>>  	if (ocfs2_is_hard_readonly(osb))
>>> -		goto leave;
>>> +		goto out;
>>>  
>>>  	mutex_init(&osb->obs_trim_fs_mutex);
>>>  
>>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
>>>  		if (status == -EBADR && ocfs2_userspace_stack(osb))
>>>  			mlog(ML_ERROR, "couldn't mount because cluster name on"
>>>  			" disk does not match the running cluster name.\n");
>>> -		goto leave;
>>> +		goto out;
>>>  	}
>>>  
>>>  	status = ocfs2_super_lock(osb, 1);
>>>  	if (status < 0) {
>>>  		mlog_errno(status);
>>> -		goto leave;
>>> +		goto out_dlm;
>>>  	}
>>> -	unlock_super = 1;
>>>  
>>>  	/* This will load up the node map and add ourselves to it. */
>>>  	status = ocfs2_find_slot(osb);
>>>  	if (status < 0) {
>>>  		mlog_errno(status);
>>> -		goto leave;
>>> +		goto out_super_lock;
>>>  	}
>>>  
>>>  	/* load all node-local system inodes */
>>>  	status = ocfs2_init_local_system_inodes(osb);
>>>  	if (status < 0) {
>>>  		mlog_errno(status);
>>> -		goto leave;
>>> +		goto out_super_lock;
>>>  	}
>>>  
>>>  	status = ocfs2_check_volume(osb);
>>>  	if (status < 0) {
>>>  		mlog_errno(status);
>>> -		goto leave;
>>> +		goto out_system_inodes;
>>>  	}
>>>  
>>>  	status = ocfs2_truncate_log_init(osb);
>>> -	if (status < 0)
>>> +	if (status < 0) {
>>>  		mlog_errno(status);
>>> +		goto out_journal_shutdown;
>>
>> ocfs2_check_volume() not only does journal initialization,
>> but also local alloc.
>>
> 
> I agree and already handled this case. Please read following explanation.
> 
> Call flow:
> 
> ocfs2_fill_super
>  ocfs2_mount_volume
>   + ocfs2_init_local_system_inodes //[1]
>   + ocfs2_check_volume
>      ocfs2_load_local_alloc //may fail
> 
> If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
> And other alloced inodes (from [1]) still exist.
> 
> Also note, the sys inodes release function is ocfs2_release_system_inodes(),
> which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
> are NULL pointer case.
> 
> The cleanup job will be done in both places:
> - ocfs2_mount_volume (patch 4/5)
> - ocfs2_fill_super (patch 5/5)
> 
> Related code after patched 4/5:
> (please check my comment start with "hm:")
> ```
> static int ocfs2_mount_volume(struct super_block *sb)
> {
> 	... ...
> 	
> 	status = ocfs2_check_volume(osb);
> 	if (status < 0) {
> 		mlog_errno(status);
> 		
> 		//hm:
> 		//need to release global/local sys inodes (by //ocfs2_release_system_inodes())
> 		//which are alloced by above [1].
> 		goto out_system_inodes;
> 	}
> 	
> 	status = ocfs2_truncate_log_init(osb);
> 	if (status < 0) {
> 		mlog_errno(status);
> 		
> 		//hm:
> 		//need to release journal: by ocfs2_journal_shutdown()
> 		//need to release global/local sys inodes: byocfs2_release_system_inodes()
> 		goto out_journal_shutdown;
> 	}
> 	
> 	ocfs2_super_unlock(osb, 1);
> 	return status; //<== hm: will change to 0 under next comment
> 	
> out_journal_shutdown:
> 	ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
> out_system_inodes:
> 	ocfs2_release_system_inodes(osb);
> out_super_lock:
> 	ocfs2_super_unlock(osb, 1);
> out_dlm:
> 	ocfs2_dlm_shutdown(osb, 0);
> out:
> 	return status;
> }
> ``` 
I mean we may need call ocfs2_shutdown_local_alloc() to do the cleanup.
In ocfs2_fill_super(), you don't do that.

Thanks,
Joseph


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-12 11:54       ` Joseph Qi via Ocfs2-devel
@ 2022-04-12 16:46         ` Heming Zhao via Ocfs2-devel
  0 siblings, 0 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12 16:46 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On Tue, Apr 12, 2022 at 07:54:35PM +0800, Joseph Qi wrote:
> 
> On 4/12/22 4:22 PM, Heming Zhao wrote:
> > My last reply (on 4.10) had messy format, because thunderbird automatically
> > add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
> > Please check my comments in below.
> > 
> > On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
> >>
> >>
> >> On 4/8/22 6:30 PM, Heming Zhao wrote:
> >>> After this patch, when error, ocfs2_fill_super doesn't take care to
> >>> release resources which are allocated in ocfs2_mount_volume.
> >>>
> >>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> >>> ---
> >>>  fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
> >>>  1 file changed, 20 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> >>> index 8443ba031dec..d4b7a29cb364 100644
> >>> --- a/fs/ocfs2/super.c
> >>> +++ b/fs/ocfs2/super.c
> >>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
> >>>  static int ocfs2_mount_volume(struct super_block *sb)
> >>>  {
> >>>  	int status = 0;
> >>> -	int unlock_super = 0;
> >>>  	struct ocfs2_super *osb = OCFS2_SB(sb);
> >>>  
> >>>  	if (ocfs2_is_hard_readonly(osb))
> >>> -		goto leave;
> >>> +		goto out;
> >>>  
> >>>  	mutex_init(&osb->obs_trim_fs_mutex);
> >>>  
> >>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
> >>>  		if (status == -EBADR && ocfs2_userspace_stack(osb))
> >>>  			mlog(ML_ERROR, "couldn't mount because cluster name on"
> >>>  			" disk does not match the running cluster name.\n");
> >>> -		goto leave;
> >>> +		goto out;
> >>>  	}
> >>>  
> >>>  	status = ocfs2_super_lock(osb, 1);
> >>>  	if (status < 0) {
> >>>  		mlog_errno(status);
> >>> -		goto leave;
> >>> +		goto out_dlm;
> >>>  	}
> >>> -	unlock_super = 1;
> >>>  
> >>>  	/* This will load up the node map and add ourselves to it. */
> >>>  	status = ocfs2_find_slot(osb);
> >>>  	if (status < 0) {
> >>>  		mlog_errno(status);
> >>> -		goto leave;
> >>> +		goto out_super_lock;
> >>>  	}
> >>>  
> >>>  	/* load all node-local system inodes */
> >>>  	status = ocfs2_init_local_system_inodes(osb);
> >>>  	if (status < 0) {
> >>>  		mlog_errno(status);
> >>> -		goto leave;
> >>> +		goto out_super_lock;
> >>>  	}
> >>>  
> >>>  	status = ocfs2_check_volume(osb);
> >>>  	if (status < 0) {
> >>>  		mlog_errno(status);
> >>> -		goto leave;
> >>> +		goto out_system_inodes;
> >>>  	}
> >>>  
> >>>  	status = ocfs2_truncate_log_init(osb);
> >>> -	if (status < 0)
> >>> +	if (status < 0) {
> >>>  		mlog_errno(status);
> >>> +		goto out_journal_shutdown;
> >>
> >> ocfs2_check_volume() not only does journal initialization,
> >> but also local alloc.
> >>
> > 
> > I agree and already handled this case. Please read following explanation.
> > 
> > Call flow:
> > 
> > ocfs2_fill_super
> >  ocfs2_mount_volume
> >   + ocfs2_init_local_system_inodes //[1]
> >   + ocfs2_check_volume
> >      ocfs2_load_local_alloc //may fail
> > 
> > If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
> > And other alloced inodes (from [1]) still exist.
> > 
> > Also note, the sys inodes release function is ocfs2_release_system_inodes(),
> > which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
> > are NULL pointer case.
> > 
> > The cleanup job will be done in both places:
> > - ocfs2_mount_volume (patch 4/5)
> > - ocfs2_fill_super (patch 5/5)
> > 
> > Related code after patched 4/5:
> > (please check my comment start with "hm:")
> > ```
> > static int ocfs2_mount_volume(struct super_block *sb)
> > {
> > 	... ...
> > 	
> > 	status = ocfs2_check_volume(osb);
> > 	if (status < 0) {
> > 		mlog_errno(status);
> > 		
> > 		//hm:
> > 		//need to release global/local sys inodes (by //ocfs2_release_system_inodes())
> > 		//which are alloced by above [1].
> > 		goto out_system_inodes;
> > 	}
> > 	
> > 	status = ocfs2_truncate_log_init(osb);
> > 	if (status < 0) {
> > 		mlog_errno(status);
> > 		
> > 		//hm:
> > 		//need to release journal: by ocfs2_journal_shutdown()
> > 		//need to release global/local sys inodes: byocfs2_release_system_inodes()
> > 		goto out_journal_shutdown;
> > 	}
> > 	
> > 	ocfs2_super_unlock(osb, 1);
> > 	return status; //<== hm: will change to 0 under next comment
> > 	
> > out_journal_shutdown:
> > 	ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
> > out_system_inodes:
> > 	ocfs2_release_system_inodes(osb);
> > out_super_lock:
> > 	ocfs2_super_unlock(osb, 1);
> > out_dlm:
> > 	ocfs2_dlm_shutdown(osb, 0);
> > out:
> > 	return status;
> > }
> > ``` 
> I mean we may need call ocfs2_shutdown_local_alloc() to do the cleanup.
> In ocfs2_fill_super(), you don't do that.
> 
 
You are right, my mistake.

I also found another mistake in this patch: "goto out_journal_shutdown" is wrong.
After ocfs2_truncate_log_init() failure, code firstly calls ocfs2_journal_shutdown()
to clean journal then calls ocfs2_release_system_inodes(), it will trigger crash.

- Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2022-04-13 13:50 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
2022-04-09 13:11   ` Joseph Qi via Ocfs2-devel
2022-04-09 16:28     ` heming.zhao--- via Ocfs2-devel
2022-04-10 12:00       ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
2022-04-09 13:14   ` Joseph Qi via Ocfs2-devel
2022-04-09 16:29     ` heming.zhao--- via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
2022-04-09 13:30   ` Joseph Qi via Ocfs2-devel
2022-04-09 16:47     ` heming.zhao--- via Ocfs2-devel
2022-04-11  1:56       ` Joseph Qi via Ocfs2-devel
2022-04-11  2:09         ` heming.zhao--- via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
2022-04-09 13:46   ` Joseph Qi via Ocfs2-devel
2022-04-10  3:58     ` heming.zhao--- via Ocfs2-devel
2022-04-12  8:22     ` Heming Zhao via Ocfs2-devel
2022-04-12 11:54       ` Joseph Qi via Ocfs2-devel
2022-04-12 16:46         ` Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
2022-04-09 14:02   ` Joseph Qi via Ocfs2-devel
2022-04-10 10:25     ` heming.zhao--- via Ocfs2-devel
2022-04-12  8:49     ` Heming Zhao via Ocfs2-devel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.