ocfs2-devel.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage
@ 2022-04-13  8:29 Heming Zhao via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced Heming Zhao via Ocfs2-devel
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

** v2 **

patch 1/5:
 - change patch subject.
 - add a new function ocfs2_journal_alloc
 - add new comment at front of ocfs2_journal_alloc
 - in ocfs2_initialize_super, change legacy comment before calling
   ocfs2_journal_alloc.

patch 2/5:
 - revise commit log according review comment.
 - remove meanless comment of ocfs2_resmap_init in header file.

patch 3/5:
 - set "sb->s_fs_info = NULL" (osb is NULL) when ocfs2_initialize_super
   fails.

patch 4/5:
 for ocfs2_mount_volume():
 - make it return 0 (before return "status") as successful.
 - remove goto label "out_journal_shutdown"
 - add more clean job in label "out_system_inodes"
   - add ocfs2_shutdown_local_alloc() to release local alloc resources.
   - add ocfs2_journal_shutdown() to clean up journal.

 for ocfs2_check_volume():
 - add new comment at the front of ocfs2_check_volume.
 - change comment style/format for "struct ocfs2_dinode *local_alloc"

patch 5/5:
 revise commit log.

 in ocfs2_fill_super():
 - change goto label "out_journal" to "out_dismount"
 - add "goto out" in label "out_dismount".

** draft -> v1 ** 

- split one patch into 5 patches.
- goto labal name change to out_xxx style.
- only test for mount/umount & 0001-xx.patch related issue.

Heming Zhao (5):
  ocfs2: fix mounting crash if journal is not alloced
  ocfs2: change return type of ocfs2_resmap_init
  ocfs2: ocfs2_initialize_super does cleanup job before return error
  ocfs2: ocfs2_mount_volume does cleanup job before return error
  ocfs2: rewrite error handling of ocfs2_fill_super

 fs/ocfs2/inode.c        |   4 +-
 fs/ocfs2/journal.c      |  32 ++++---
 fs/ocfs2/reservations.c |   4 +-
 fs/ocfs2/reservations.h |   9 +-
 fs/ocfs2/super.c        | 188 +++++++++++++++++++++++++---------------
 5 files changed, 143 insertions(+), 94 deletions(-)

-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced
  2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
@ 2022-04-13  8:29 ` Heming Zhao via Ocfs2-devel
  2022-04-13 10:54   ` Joseph Qi via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
journal init later than before, it makes NULL pointer access in free
routine.

Crash flow:

ocfs2_fill_super
 + ocfs2_mount_volume
 |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
 |  + ...
 |  + ocfs2_check_volume //no chance to init osb->journal
 |
 + ...
 + ocfs2_dismount_volume
    ocfs2_release_system_inodes
      ...
       evict
        ...
         ocfs2_clear_inode
          ocfs2_checkpoint_inode
           ocfs2_ci_fully_checkpointed
            time_after(journal->j_trans_id, ci->ci_last_trans)
             + journal is empty, crash!

For fixing, there are three solutions:

1> Partly revert commit da5e7c87827e8

For avoiding kernel crash, this make sense for us. We only concerned
whether there has any non-system inode access before dlm init. The
answer is NO. And all journal replay/recovery handling happen after
dlm & journal init done. So this method is not graceful but workable.

2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)

The fix code is special for mounting phase, but it will continue
working after mounting stage. In another word, this method adds useless
code in normal inode free flow.

3> Do directly free inode in mounting phase

This method is brutal/complex and may introduce unsafe code, currently
maintainer didn't like.

At last, we chose method <1> and did partly reverted job.
We reverted journal init codes, and kept cleanup codes flow.

Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/inode.c   |  4 ++--
 fs/ocfs2/journal.c | 32 ++++++++++++++++++++++----------
 fs/ocfs2/super.c   | 16 ++++++++++++++++
 3 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 5739dc301569..bb116c39b581 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 	struct inode *inode = NULL;
 	struct super_block *sb = osb->sb;
 	struct ocfs2_find_inode_args args;
+	journal_t *journal = osb->journal->j_journal;
 
 	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
 			       sysfile_type);
@@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
 	 * part of the transaction - the inode could have been reclaimed and
 	 * now it is reread from disk.
 	 */
-	if (osb->journal) {
+	if (journal) {
 		transaction_t *transaction;
 		tid_t tid;
 		struct ocfs2_inode_info *oi = OCFS2_I(inode);
-		journal_t *journal = osb->journal->j_journal;
 
 		read_lock(&journal->j_state_lock);
 		if (journal->j_running_transaction)
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 1887a2708709..49255fddce45 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb)
 	write_unlock(&journal->j_state_lock);
 }
 
-int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
+/*
+ * alloc & initialize skeleton for journal structure.
+ * ocfs2_journal_init() will make fs have journal ability.
+ */
+int ocfs2_journal_alloc(struct ocfs2_super *osb)
 {
-	int status = -1;
-	struct inode *inode = NULL; /* the journal inode */
-	journal_t *j_journal = NULL;
-	struct ocfs2_journal *journal = NULL;
-	struct ocfs2_dinode *di = NULL;
-	struct buffer_head *bh = NULL;
-	int inode_lock = 0;
+	int status = 0;
+	struct ocfs2_journal *journal;
 
-	/* initialize our journal structure */
 	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
 	if (!journal) {
 		mlog(ML_ERROR, "unable to alloc journal\n");
 		status = -ENOMEM;
-		goto done;
+		goto bail;
 	}
 	osb->journal = journal;
 	journal->j_osb = osb;
@@ -839,6 +837,20 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
 	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
 	journal->j_state = OCFS2_JOURNAL_FREE;
 
+bail:
+	return status;
+}
+
+int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
+{
+	int status = -1;
+	struct inode *inode = NULL; /* the journal inode */
+	journal_t *j_journal = NULL;
+	struct ocfs2_journal *journal = osb->journal;
+	struct ocfs2_dinode *di = NULL;
+	struct buffer_head *bh = NULL;
+	int inode_lock = 0;
+
 	/* already have the inode for our journal */
 	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
 					    osb->slot_num);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 477cdf94122e..babec2c9d638 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	int i, cbits, bbits;
 	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
 	struct inode *inode = NULL;
+	struct ocfs2_journal *journal;
 	struct ocfs2_super *osb;
 	u64 total_blocks;
 
@@ -2195,6 +2196,15 @@ static int ocfs2_initialize_super(struct super_block *sb,
 
 	get_random_bytes(&osb->s_next_generation, sizeof(u32));
 
+	/*
+	 * FIXME
+	 * This should be done in ocfs2_journal_init(), but any inode
+	 * writes back operation will cause the filesystem to crash.
+	 */
+	status = ocfs2_journal_alloc(osb);
+	if (status)
+		goto bail;
+
 	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
 	init_llist_head(&osb->dquot_drop_list);
 
@@ -2483,6 +2493,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
 
 	kfree(osb->osb_orphan_wipes);
 	kfree(osb->slot_recovery_generations);
+	/* FIXME
+	 * This belongs in journal shutdown, but because we have to
+	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
+	 * we free it here.
+	 */
+	kfree(osb->journal);
 	kfree(osb->local_alloc_copy);
 	kfree(osb->uuid_str);
 	kfree(osb->vol_label);
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init
  2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced Heming Zhao via Ocfs2-devel
@ 2022-04-13  8:29 ` Heming Zhao via Ocfs2-devel
  2022-04-13 10:56   ` Joseph Qi via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

Since ocfs2_resmap_init() always return 0, change it to void.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/reservations.c | 4 +---
 fs/ocfs2/reservations.h | 9 ++-------
 fs/ocfs2/super.c        | 6 +-----
 3 files changed, 4 insertions(+), 15 deletions(-)

diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
index 769e466887b0..a9d1296d736d 100644
--- a/fs/ocfs2/reservations.c
+++ b/fs/ocfs2/reservations.c
@@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
 	resv->r_flags |= flags;
 }
 
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
 		      struct ocfs2_reservation_map *resmap)
 {
 	memset(resmap, 0, sizeof(*resmap));
@@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
 	resmap->m_reservations = RB_ROOT;
 	/* m_bitmap_len is initialized to zero by the above memset. */
 	INIT_LIST_HEAD(&resmap->m_lru);
-
-	return 0;
 }
 
 static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
index 677c50663595..ec8101ef5717 100644
--- a/fs/ocfs2/reservations.h
+++ b/fs/ocfs2/reservations.h
@@ -73,15 +73,10 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
 
 /**
  * ocfs2_resmap_init() - Initialize fields of a reservations bitmap
+ * @osb: struct ocfs2_super to be saved in resmap
  * @resmap: struct ocfs2_reservation_map to initialize
- * @obj: unused for now
- * @ops: unused for now
- * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
- *
- * Only possible return value other than '0' is -ENOMEM for failure to
- * allocation mirror bitmap.
  */
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
 		      struct ocfs2_reservation_map *resmap);
 
 /**
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index babec2c9d638..7076125f5b46 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 
 	init_waitqueue_head(&osb->osb_mount_event);
 
-	status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
-	if (status) {
-		mlog_errno(status);
-		goto bail;
-	}
+	ocfs2_resmap_init(osb, &osb->osb_la_resmap);
 
 	osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
 	if (!osb->vol_label) {
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced Heming Zhao via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-13  8:29 ` Heming Zhao via Ocfs2-devel
  2022-04-13 11:02   ` Joseph Qi via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
  4 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_initialize_super.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 59 +++++++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 7076125f5b46..4302c3e9598c 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out;
 	}
 
 	sb->s_fs_info = osb;
@@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
 		     osb->max_slots);
 		status = -EINVAL;
-		goto bail;
+		goto out;
 	}
 
 	ocfs2_orphan_scan_init(osb);
@@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (status) {
 		mlog(ML_ERROR, "Unable to initialize recovery state\n");
 		mlog_errno(status);
-		goto bail;
+		goto out;
 	}
 
 	init_waitqueue_head(&osb->checkpoint_event);
@@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->vol_label) {
 		mlog(ML_ERROR, "unable to alloc vol label\n");
 		status = -ENOMEM;
-		goto bail;
+		goto out_recovery_map;
 	}
 
 	osb->slot_recovery_generations =
@@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->slot_recovery_generations) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_vol_label;
 	}
 
 	init_waitqueue_head(&osb->osb_wipe_event);
@@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->osb_orphan_wipes) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_slot_recovery_gen;
 	}
 
 	osb->osb_rf_lock_tree = RB_ROOT;
@@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "couldn't mount because of unsupported "
 		     "optional features (%x).\n", i);
 		status = -EINVAL;
-		goto bail;
+		goto out_orphan_wipes;
 	}
 	if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
 		mlog(ML_ERROR, "couldn't mount RDWR because of "
 		     "unsupported optional features (%x).\n", i);
 		status = -EINVAL;
-		goto bail;
+		goto out_orphan_wipes;
 	}
 
 	if (ocfs2_clusterinfo_valid(osb)) {
@@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 			     "cluster stack label (%s) \n",
 			     osb->osb_cluster_stack);
 			status = -EINVAL;
-			goto bail;
+			goto out_orphan_wipes;
 		}
 		memcpy(osb->osb_cluster_name,
 			OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
@@ -2199,7 +2199,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	 */
 	status = ocfs2_journal_alloc(osb);
 	if (status)
-		goto bail;
+		goto out_orphan_wipes;
 
 	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
 	init_llist_head(&osb->dquot_drop_list);
@@ -2214,7 +2214,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
 		     osb->s_clustersize);
 		status = -EINVAL;
-		goto bail;
+		goto out_journal;
 	}
 
 	total_blocks = ocfs2_clusters_to_blocks(osb->sb,
@@ -2226,14 +2226,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
 		mlog(ML_ERROR, "Volume too large "
 		     "to mount safely on this system");
 		status = -EFBIG;
-		goto bail;
+		goto out_journal;
 	}
 
 	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
 				 sizeof(di->id2.i_super.s_uuid))) {
 		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
 		status = -ENOMEM;
-		goto bail;
+		goto out_journal;
 	}
 
 	strlcpy(osb->vol_label, di->id2.i_super.s_label,
@@ -2253,7 +2253,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!osb->osb_dlm_debug) {
 		status = -ENOMEM;
 		mlog_errno(status);
-		goto bail;
+		goto out_uuid_str;
 	}
 
 	atomic_set(&osb->vol_state, VOLUME_INIT);
@@ -2262,7 +2262,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	status = ocfs2_init_global_system_inodes(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto bail;
+		goto out_dlm_out;
 	}
 
 	/*
@@ -2273,7 +2273,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	if (!inode) {
 		status = -EINVAL;
 		mlog_errno(status);
-		goto bail;
+		goto out_system_inodes;
 	}
 
 	osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
@@ -2286,16 +2286,39 @@ static int ocfs2_initialize_super(struct super_block *sb,
 	status = ocfs2_init_slot_info(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto bail;
+		goto out_system_inodes;
 	}
 
 	osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
 	if (!osb->ocfs2_wq) {
 		status = -ENOMEM;
 		mlog_errno(status);
+		goto out_slot_info;
 	}
 
-bail:
+	return status;
+
+out_slot_info:
+	ocfs2_free_slot_info(osb);
+out_system_inodes:
+	ocfs2_release_system_inodes(osb);
+out_dlm_out:
+	ocfs2_put_dlm_debug(osb->osb_dlm_debug);
+out_uuid_str:
+	kfree(osb->uuid_str);
+out_journal:
+	kfree(osb->journal);
+out_orphan_wipes:
+	kfree(osb->osb_orphan_wipes);
+out_slot_recovery_gen:
+	kfree(osb->slot_recovery_generations);
+out_vol_label:
+	kfree(osb->vol_label);
+out_recovery_map:
+	kfree(osb->recovery_map);
+out:
+	kfree(osb);
+	sb->s_fs_info = NULL;
 	return status;
 }
 
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
                   ` (2 preceding siblings ...)
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-13  8:29 ` Heming Zhao via Ocfs2-devel
  2022-04-13 11:25   ` Joseph Qi via Ocfs2-devel
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
  4 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_mount_volume.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 42 +++++++++++++++++++++++++++---------------
 1 file changed, 27 insertions(+), 15 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 4302c3e9598c..5e860d7162d7 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
 static int ocfs2_mount_volume(struct super_block *sb)
 {
 	int status = 0;
-	int unlock_super = 0;
 	struct ocfs2_super *osb = OCFS2_SB(sb);
 
 	if (ocfs2_is_hard_readonly(osb))
-		goto leave;
+		goto out;
 
 	mutex_init(&osb->obs_trim_fs_mutex);
 
@@ -1817,44 +1816,54 @@ static int ocfs2_mount_volume(struct super_block *sb)
 		if (status == -EBADR && ocfs2_userspace_stack(osb))
 			mlog(ML_ERROR, "couldn't mount because cluster name on"
 			" disk does not match the running cluster name.\n");
-		goto leave;
+		goto out;
 	}
 
 	status = ocfs2_super_lock(osb, 1);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_dlm;
 	}
-	unlock_super = 1;
 
 	/* This will load up the node map and add ourselves to it. */
 	status = ocfs2_find_slot(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_super_lock;
 	}
 
 	/* load all node-local system inodes */
 	status = ocfs2_init_local_system_inodes(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_super_lock;
 	}
 
 	status = ocfs2_check_volume(osb);
 	if (status < 0) {
 		mlog_errno(status);
-		goto leave;
+		goto out_system_inodes;
 	}
 
 	status = ocfs2_truncate_log_init(osb);
-	if (status < 0)
+	if (status < 0) {
 		mlog_errno(status);
+		goto out_system_inodes;
+	}
 
-leave:
-	if (unlock_super)
-		ocfs2_super_unlock(osb, 1);
+	ocfs2_super_unlock(osb, 1);
+	return 0;
 
+out_system_inodes:
+	if (osb->local_alloc_state == OCFS2_LA_ENABLED)
+		ocfs2_shutdown_local_alloc(osb);
+	ocfs2_release_system_inodes(osb);
+	ocfs2_journal_shutdown(osb);
+out_super_lock:
+	ocfs2_super_unlock(osb, 1);
+out_dlm:
+	ocfs2_dlm_shutdown(osb, 0);
+out:
 	return status;
 }
 
@@ -2393,14 +2402,17 @@ static int ocfs2_verify_volume(struct ocfs2_dinode *di,
 	return status;
 }
 
+/*
+ * If this function returns failure, caller responds to release
+ * here alloced resources.
+ */
 static int ocfs2_check_volume(struct ocfs2_super *osb)
 {
 	int status;
 	int dirty;
 	int local;
-	struct ocfs2_dinode *local_alloc = NULL; /* only used if we
-						  * recover
-						  * ourselves. */
+    /* only used if we recover ourselves. */
+	struct ocfs2_dinode *local_alloc = NULL;
 
 	/* Init our journal object. */
 	status = ocfs2_journal_init(osb, &dirty);
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
                   ` (3 preceding siblings ...)
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-13  8:29 ` Heming Zhao via Ocfs2-devel
  2022-04-13 11:37   ` Joseph Qi via Ocfs2-devel
  4 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13  8:29 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

Current ocfs2_fill_super() uses one goto label "read_super_error" to
handle all error cases. And with previous serial patches, the error
handling should fork more branches to handle different error cases.
This patch rewrite the error handling of ocfs2_fill_super.

Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
 fs/ocfs2/super.c | 67 +++++++++++++++++++++++-------------------------
 1 file changed, 32 insertions(+), 35 deletions(-)

diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 5e860d7162d7..72673d40d29c 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
 		status = -EINVAL;
-		goto read_super_error;
+		goto out;
 	}
 
 	/* probe for superblock */
 	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
 	if (status < 0) {
 		mlog(ML_ERROR, "superblock probe failed!\n");
-		goto read_super_error;
+		goto out;
 	}
 
 	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
-	osb = OCFS2_SB(sb);
-	if (status < 0) {
-		mlog_errno(status);
-		goto read_super_error;
-	}
 	brelse(bh);
 	bh = NULL;
+	if (status < 0)
+		goto out;
+
+	osb = OCFS2_SB(sb);
 
 	if (!ocfs2_check_set_options(sb, &parsed_options)) {
 		status = -EINVAL;
-		goto read_super_error;
+		goto out_super;
 	}
 	osb->s_mount_opt = parsed_options.mount_opt;
 	osb->s_atime_quantum = parsed_options.atime_quantum;
@@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
 	if (status)
-		goto read_super_error;
+		goto out_super;
 
 	sb->s_magic = OCFS2_SUPER_MAGIC;
 
@@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 			status = -EACCES;
 			mlog(ML_ERROR, "Readonly device detected but readonly "
 			     "mount was not specified.\n");
-			goto read_super_error;
+			goto out_super;
 		}
 
 		/* You should not be able to start a local heartbeat
@@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 			status = -EROFS;
 			mlog(ML_ERROR, "Local heartbeat specified on readonly "
 			     "device.\n");
-			goto read_super_error;
+			goto out_super;
 		}
 
 		status = ocfs2_check_journals_nolocks(osb);
@@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 				mlog(ML_ERROR, "Recovery required on readonly "
 				     "file system, but write access is "
 				     "unavailable.\n");
-			else
-				mlog_errno(status);
-			goto read_super_error;
+			goto out_super;
 		}
 
 		ocfs2_set_ro_flag(osb, 1);
@@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	}
 
 	status = ocfs2_verify_heartbeat(osb);
-	if (status < 0) {
-		mlog_errno(status);
-		goto read_super_error;
-	}
+	if (status < 0)
+		goto out_super;
 
 	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
 						 ocfs2_debugfs_root);
@@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	status = ocfs2_mount_volume(sb);
 	if (status < 0)
-		goto read_super_error;
+		goto out_debugfs;
 
 	if (osb->root_inode)
 		inode = igrab(osb->root_inode);
 
 	if (!inode) {
 		status = -EIO;
-		mlog_errno(status);
-		goto read_super_error;
+		goto out_dismount;
 	}
 
 	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
@@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 	if (!osb->osb_dev_kset) {
 		status = -ENOMEM;
 		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
-		goto read_super_error;
+		goto out_dismount;
 	}
 
 	/* Create filecheck sysfs related directories/files at
@@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 		status = -ENOMEM;
 		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
 			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
-		goto read_super_error;
+		goto out_dismount;
 	}
 
 	root = d_make_root(inode);
 	if (!root) {
 		status = -ENOMEM;
-		mlog_errno(status);
-		goto read_super_error;
+		goto out_dismount;
 	}
 
 	sb->s_root = root;
@@ -1178,17 +1171,21 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
 
 	return status;
 
-read_super_error:
-	brelse(bh);
-
-	if (status)
-		mlog_errno(status);
+out_dismount:
+	atomic_set(&osb->vol_state, VOLUME_DISABLED);
+	wake_up(&osb->osb_mount_event);
+	ocfs2_dismount_volume(sb, 1);
+	goto out;
 
-	if (osb) {
-		atomic_set(&osb->vol_state, VOLUME_DISABLED);
-		wake_up(&osb->osb_mount_event);
-		ocfs2_dismount_volume(sb, 1);
-	}
+out_debugfs:
+	debugfs_remove_recursive(osb->osb_debug_root);
+out_super:
+	ocfs2_release_system_inodes(osb);
+	kfree(osb->recovery_map);
+	ocfs2_delete_osb(osb);
+	kfree(osb);
+out:
+	mlog_errno(status);
 
 	return status;
 }
-- 
2.35.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced Heming Zhao via Ocfs2-devel
@ 2022-04-13 10:54   ` Joseph Qi via Ocfs2-devel
  2022-04-13 13:18     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-13 10:54 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/13/22 4:29 PM, Heming Zhao wrote:
> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
> journal init later than before, it makes NULL pointer access in free
> routine.
> 
> Crash flow:
> 
> ocfs2_fill_super
>  + ocfs2_mount_volume
>  |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>  |  + ...
>  |  + ocfs2_check_volume //no chance to init osb->journal
>  |
>  + ...
>  + ocfs2_dismount_volume
>     ocfs2_release_system_inodes
>       ...
>        evict
>         ...
>          ocfs2_clear_inode
>           ocfs2_checkpoint_inode
>            ocfs2_ci_fully_checkpointed
>             time_after(journal->j_trans_id, ci->ci_last_trans)
>              + journal is empty, crash!
> 
> For fixing, there are three solutions:
> 
> 1> Partly revert commit da5e7c87827e8
> 
> For avoiding kernel crash, this make sense for us. We only concerned
> whether there has any non-system inode access before dlm init. The
> answer is NO. And all journal replay/recovery handling happen after
> dlm & journal init done. So this method is not graceful but workable.
> 
> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
> 
> The fix code is special for mounting phase, but it will continue
> working after mounting stage. In another word, this method adds useless
> code in normal inode free flow.
> 
> 3> Do directly free inode in mounting phase
> 
> This method is brutal/complex and may introduce unsafe code, currently
> maintainer didn't like.
> 
> At last, we chose method <1> and did partly reverted job.
> We reverted journal init codes, and kept cleanup codes flow.
> 
> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/inode.c   |  4 ++--
>  fs/ocfs2/journal.c | 32 ++++++++++++++++++++++----------
>  fs/ocfs2/super.c   | 16 ++++++++++++++++
>  3 files changed, 40 insertions(+), 12 deletions(-)
> 
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 5739dc301569..bb116c39b581 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>  	struct inode *inode = NULL;
>  	struct super_block *sb = osb->sb;
>  	struct ocfs2_find_inode_args args;
> +	journal_t *journal = osb->journal->j_journal;
>  
>  	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>  			       sysfile_type);
> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>  	 * part of the transaction - the inode could have been reclaimed and
>  	 * now it is reread from disk.
>  	 */
> -	if (osb->journal) {
> +	if (journal) {
>  		transaction_t *transaction;
>  		tid_t tid;
>  		struct ocfs2_inode_info *oi = OCFS2_I(inode);
> -		journal_t *journal = osb->journal->j_journal;
>  
>  		read_lock(&journal->j_state_lock);
>  		if (journal->j_running_transaction)
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index 1887a2708709..49255fddce45 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb)
>  	write_unlock(&journal->j_state_lock);
>  }
>  
> -int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
> +/*
> + * alloc & initialize skeleton for journal structure.
> + * ocfs2_journal_init() will make fs have journal ability.
> + */
> +int ocfs2_journal_alloc(struct ocfs2_super *osb)
>  {
> -	int status = -1;
> -	struct inode *inode = NULL; /* the journal inode */
> -	journal_t *j_journal = NULL;
> -	struct ocfs2_journal *journal = NULL;
> -	struct ocfs2_dinode *di = NULL;
> -	struct buffer_head *bh = NULL;
> -	int inode_lock = 0;
> +	int status = 0;
> +	struct ocfs2_journal *journal;
>  
> -	/* initialize our journal structure */
>  	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>  	if (!journal) {
>  		mlog(ML_ERROR, "unable to alloc journal\n");
>  		status = -ENOMEM;
> -		goto done;
> +		goto bail;
>  	}
>  	osb->journal = journal;
>  	journal->j_osb = osb;
> @@ -839,6 +837,20 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>  	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>  	journal->j_state = OCFS2_JOURNAL_FREE;
>  
> +bail:
> +	return status;
> +}
> +
> +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
> +{
> +	int status = -1;
> +	struct inode *inode = NULL; /* the journal inode */
> +	journal_t *j_journal = NULL;
> +	struct ocfs2_journal *journal = osb->journal;
> +	struct ocfs2_dinode *di = NULL;
> +	struct buffer_head *bh = NULL;
> +	int inode_lock = 0;
> +

Better to leave a BUG_ON(!journal) here like before.

>  	/* already have the inode for our journal */
>  	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>  					    osb->slot_num);
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 477cdf94122e..babec2c9d638 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	int i, cbits, bbits;
>  	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>  	struct inode *inode = NULL;
> +	struct ocfs2_journal *journal;
>  	struct ocfs2_super *osb;
>  	u64 total_blocks;
>  
> @@ -2195,6 +2196,15 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  
>  	get_random_bytes(&osb->s_next_generation, sizeof(u32));
>  
> +	/*
> +	 * FIXME
> +	 * This should be done in ocfs2_journal_init(), but any inode
> +	 * writes back operation will cause the filesystem to crash.
> +	 */
> +	status = ocfs2_journal_alloc(osb);
> +	if (status)

Explicitly mark 'status < 0'?

Thanks,
Joseph

> +		goto bail;
> +
>  	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>  	init_llist_head(&osb->dquot_drop_list);
>  
> @@ -2483,6 +2493,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>  
>  	kfree(osb->osb_orphan_wipes);
>  	kfree(osb->slot_recovery_generations);
> +	/* FIXME
> +	 * This belongs in journal shutdown, but because we have to
> +	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
> +	 * we free it here.
> +	 */
> +	kfree(osb->journal);
>  	kfree(osb->local_alloc_copy);
>  	kfree(osb->uuid_str);
>  	kfree(osb->vol_label);

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-13 10:56   ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-13 10:56 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/13/22 4:29 PM, Heming Zhao wrote:
> Since ocfs2_resmap_init() always return 0, change it to void.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>

Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
> ---
>  fs/ocfs2/reservations.c | 4 +---
>  fs/ocfs2/reservations.h | 9 ++-------
>  fs/ocfs2/super.c        | 6 +-----
>  3 files changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
> index 769e466887b0..a9d1296d736d 100644
> --- a/fs/ocfs2/reservations.c
> +++ b/fs/ocfs2/reservations.c
> @@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
>  	resv->r_flags |= flags;
>  }
>  
> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>  		      struct ocfs2_reservation_map *resmap)
>  {
>  	memset(resmap, 0, sizeof(*resmap));
> @@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
>  	resmap->m_reservations = RB_ROOT;
>  	/* m_bitmap_len is initialized to zero by the above memset. */
>  	INIT_LIST_HEAD(&resmap->m_lru);
> -
> -	return 0;
>  }
>  
>  static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
> diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
> index 677c50663595..ec8101ef5717 100644
> --- a/fs/ocfs2/reservations.h
> +++ b/fs/ocfs2/reservations.h
> @@ -73,15 +73,10 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
>  
>  /**
>   * ocfs2_resmap_init() - Initialize fields of a reservations bitmap
> + * @osb: struct ocfs2_super to be saved in resmap
>   * @resmap: struct ocfs2_reservation_map to initialize
> - * @obj: unused for now
> - * @ops: unused for now
> - * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
> - *
> - * Only possible return value other than '0' is -ENOMEM for failure to
> - * allocation mirror bitmap.
>   */
> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>  		      struct ocfs2_reservation_map *resmap);
>  
>  /**
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index babec2c9d638..7076125f5b46 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  
>  	init_waitqueue_head(&osb->osb_mount_event);
>  
> -	status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
> -	if (status) {
> -		mlog_errno(status);
> -		goto bail;
> -	}
> +	ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>  
>  	osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
>  	if (!osb->vol_label) {

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-13 11:02   ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-13 11:02 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/13/22 4:29 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_initialize_super.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>

Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
> ---
>  fs/ocfs2/super.c | 59 +++++++++++++++++++++++++++++++++---------------
>  1 file changed, 41 insertions(+), 18 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 7076125f5b46..4302c3e9598c 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out;
>  	}
>  
>  	sb->s_fs_info = osb;
> @@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
>  		     osb->max_slots);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out;
>  	}
>  
>  	ocfs2_orphan_scan_init(osb);
> @@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (status) {
>  		mlog(ML_ERROR, "Unable to initialize recovery state\n");
>  		mlog_errno(status);
> -		goto bail;
> +		goto out;
>  	}
>  
>  	init_waitqueue_head(&osb->checkpoint_event);
> @@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->vol_label) {
>  		mlog(ML_ERROR, "unable to alloc vol label\n");
>  		status = -ENOMEM;
> -		goto bail;
> +		goto out_recovery_map;
>  	}
>  
>  	osb->slot_recovery_generations =
> @@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->slot_recovery_generations) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_vol_label;
>  	}
>  
>  	init_waitqueue_head(&osb->osb_wipe_event);
> @@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->osb_orphan_wipes) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_slot_recovery_gen;
>  	}
>  
>  	osb->osb_rf_lock_tree = RB_ROOT;
> @@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "couldn't mount because of unsupported "
>  		     "optional features (%x).\n", i);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_orphan_wipes;
>  	}
>  	if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
>  		mlog(ML_ERROR, "couldn't mount RDWR because of "
>  		     "unsupported optional features (%x).\n", i);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_orphan_wipes;
>  	}
>  
>  	if (ocfs2_clusterinfo_valid(osb)) {
> @@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  			     "cluster stack label (%s) \n",
>  			     osb->osb_cluster_stack);
>  			status = -EINVAL;
> -			goto bail;
> +			goto out_orphan_wipes;
>  		}
>  		memcpy(osb->osb_cluster_name,
>  			OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
> @@ -2199,7 +2199,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	 */
>  	status = ocfs2_journal_alloc(osb);
>  	if (status)
> -		goto bail;
> +		goto out_orphan_wipes;
>  
>  	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>  	init_llist_head(&osb->dquot_drop_list);
> @@ -2214,7 +2214,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
>  		     osb->s_clustersize);
>  		status = -EINVAL;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	total_blocks = ocfs2_clusters_to_blocks(osb->sb,
> @@ -2226,14 +2226,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  		mlog(ML_ERROR, "Volume too large "
>  		     "to mount safely on this system");
>  		status = -EFBIG;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
>  				 sizeof(di->id2.i_super.s_uuid))) {
>  		mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
>  		status = -ENOMEM;
> -		goto bail;
> +		goto out_journal;
>  	}
>  
>  	strlcpy(osb->vol_label, di->id2.i_super.s_label,
> @@ -2253,7 +2253,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!osb->osb_dlm_debug) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_uuid_str;
>  	}
>  
>  	atomic_set(&osb->vol_state, VOLUME_INIT);
> @@ -2262,7 +2262,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	status = ocfs2_init_global_system_inodes(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_dlm_out;
>  	}
>  
>  	/*
> @@ -2273,7 +2273,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	if (!inode) {
>  		status = -EINVAL;
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_system_inodes;
>  	}
>  
>  	osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
> @@ -2286,16 +2286,39 @@ static int ocfs2_initialize_super(struct super_block *sb,
>  	status = ocfs2_init_slot_info(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto bail;
> +		goto out_system_inodes;
>  	}
>  
>  	osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
>  	if (!osb->ocfs2_wq) {
>  		status = -ENOMEM;
>  		mlog_errno(status);
> +		goto out_slot_info;
>  	}
>  
> -bail:
> +	return status;
> +
> +out_slot_info:
> +	ocfs2_free_slot_info(osb);
> +out_system_inodes:
> +	ocfs2_release_system_inodes(osb);
> +out_dlm_out:
> +	ocfs2_put_dlm_debug(osb->osb_dlm_debug);
> +out_uuid_str:
> +	kfree(osb->uuid_str);
> +out_journal:
> +	kfree(osb->journal);
> +out_orphan_wipes:
> +	kfree(osb->osb_orphan_wipes);
> +out_slot_recovery_gen:
> +	kfree(osb->slot_recovery_generations);
> +out_vol_label:
> +	kfree(osb->vol_label);
> +out_recovery_map:
> +	kfree(osb->recovery_map);
> +out:
> +	kfree(osb);
> +	sb->s_fs_info = NULL;
>  	return status;
>  }
>  

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-13 11:25   ` Joseph Qi via Ocfs2-devel
  2022-04-14  9:14     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-13 11:25 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/13/22 4:29 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_mount_volume.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/super.c | 42 +++++++++++++++++++++++++++---------------
>  1 file changed, 27 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 4302c3e9598c..5e860d7162d7 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>  static int ocfs2_mount_volume(struct super_block *sb)
>  {
>  	int status = 0;
> -	int unlock_super = 0;
>  	struct ocfs2_super *osb = OCFS2_SB(sb);
>  
>  	if (ocfs2_is_hard_readonly(osb))
> -		goto leave;
> +		goto out;
>  
>  	mutex_init(&osb->obs_trim_fs_mutex);
>  
> @@ -1817,44 +1816,54 @@ static int ocfs2_mount_volume(struct super_block *sb)
>  		if (status == -EBADR && ocfs2_userspace_stack(osb))
>  			mlog(ML_ERROR, "couldn't mount because cluster name on"
>  			" disk does not match the running cluster name.\n");
> -		goto leave;
> +		goto out;
>  	}
>  
>  	status = ocfs2_super_lock(osb, 1);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_dlm;
>  	}
> -	unlock_super = 1;
>  
>  	/* This will load up the node map and add ourselves to it. */
>  	status = ocfs2_find_slot(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_super_lock;
>  	}
>  
>  	/* load all node-local system inodes */
>  	status = ocfs2_init_local_system_inodes(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_super_lock;
>  	}
>  
>  	status = ocfs2_check_volume(osb);
>  	if (status < 0) {
>  		mlog_errno(status);
> -		goto leave;
> +		goto out_system_inodes;
>  	}
>  
>  	status = ocfs2_truncate_log_init(osb);
> -	if (status < 0)
> +	if (status < 0) {
>  		mlog_errno(status);
> +		goto out_system_inodes;
> +	}
>  
> -leave:
> -	if (unlock_super)
> -		ocfs2_super_unlock(osb, 1);
> +	ocfs2_super_unlock(osb, 1);
> +	return 0;
>  
> +out_system_inodes:
> +	if (osb->local_alloc_state == OCFS2_LA_ENABLED)
> +		ocfs2_shutdown_local_alloc(osb);
> +	ocfs2_release_system_inodes(osb);
> +	ocfs2_journal_shutdown(osb);
> +out_super_lock:
> +	ocfs2_super_unlock(osb, 1);
> +out_dlm:
> +	ocfs2_dlm_shutdown(osb, 0);
> +out:
>  	return status;
>  }
>  
> @@ -2393,14 +2402,17 @@ static int ocfs2_verify_volume(struct ocfs2_dinode *di,
>  	return status;
>  }
>  
> +/*
> + * If this function returns failure, caller responds to release
> + * here alloced resources.
> + */
>  static int ocfs2_check_volume(struct ocfs2_super *osb)
>  {
>  	int status;
>  	int dirty;
>  	int local;
> -	struct ocfs2_dinode *local_alloc = NULL; /* only used if we
> -						  * recover
> -						  * ourselves. */
> +    /* only used if we recover ourselves. */

Malformed. Actually I don't think we have to touch this function.

Thanks,
Joseph

> +	struct ocfs2_dinode *local_alloc = NULL;
>  
>  	/* Init our journal object. */
>  	status = ocfs2_journal_init(osb, &dirty);

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
@ 2022-04-13 11:37   ` Joseph Qi via Ocfs2-devel
  2022-04-13 13:11     ` Heming Zhao via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-13 11:37 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel

It mostly looks good to me.
Have you done error injection test after this patchset applied?
I'm afraid ocfs2-test can't cover this as of now.

Thanks,
Joseph

On 4/13/22 4:29 PM, Heming Zhao wrote:
> Current ocfs2_fill_super() uses one goto label "read_super_error" to
> handle all error cases. And with previous serial patches, the error
> handling should fork more branches to handle different error cases.
> This patch rewrite the error handling of ocfs2_fill_super.
> 
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
>  fs/ocfs2/super.c | 67 +++++++++++++++++++++++-------------------------
>  1 file changed, 32 insertions(+), 35 deletions(-)
> 
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 5e860d7162d7..72673d40d29c 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
>  		status = -EINVAL;
> -		goto read_super_error;
> +		goto out;
>  	}
>  
>  	/* probe for superblock */
>  	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
>  	if (status < 0) {
>  		mlog(ML_ERROR, "superblock probe failed!\n");
> -		goto read_super_error;
> +		goto out;
>  	}
>  
>  	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> -	osb = OCFS2_SB(sb);
> -	if (status < 0) {
> -		mlog_errno(status);
> -		goto read_super_error;
> -	}
>  	brelse(bh);
>  	bh = NULL;
> +	if (status < 0)
> +		goto out;
> +
> +	osb = OCFS2_SB(sb);
>  
>  	if (!ocfs2_check_set_options(sb, &parsed_options)) {
>  		status = -EINVAL;
> -		goto read_super_error;
> +		goto out_super;
>  	}
>  	osb->s_mount_opt = parsed_options.mount_opt;
>  	osb->s_atime_quantum = parsed_options.atime_quantum;
> @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
>  	if (status)
> -		goto read_super_error;
> +		goto out_super;
>  
>  	sb->s_magic = OCFS2_SUPER_MAGIC;
>  
> @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  			status = -EACCES;
>  			mlog(ML_ERROR, "Readonly device detected but readonly "
>  			     "mount was not specified.\n");
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		/* You should not be able to start a local heartbeat
> @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  			status = -EROFS;
>  			mlog(ML_ERROR, "Local heartbeat specified on readonly "
>  			     "device.\n");
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		status = ocfs2_check_journals_nolocks(osb);
> @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  				mlog(ML_ERROR, "Recovery required on readonly "
>  				     "file system, but write access is "
>  				     "unavailable.\n");
> -			else
> -				mlog_errno(status);
> -			goto read_super_error;
> +			goto out_super;
>  		}
>  
>  		ocfs2_set_ro_flag(osb, 1);
> @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  	}
>  
>  	status = ocfs2_verify_heartbeat(osb);
> -	if (status < 0) {
> -		mlog_errno(status);
> -		goto read_super_error;
> -	}
> +	if (status < 0)
> +		goto out_super;
>  
>  	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
>  						 ocfs2_debugfs_root);
> @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	status = ocfs2_mount_volume(sb);
>  	if (status < 0)
> -		goto read_super_error;
> +		goto out_debugfs;
>  
>  	if (osb->root_inode)
>  		inode = igrab(osb->root_inode);
>  
>  	if (!inode) {
>  		status = -EIO;
> -		mlog_errno(status);
> -		goto read_super_error;
> +		goto out_dismount;
>  	}
>  
>  	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  	if (!osb->osb_dev_kset) {
>  		status = -ENOMEM;
>  		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> -		goto read_super_error;
> +		goto out_dismount;
>  	}
>  
>  	/* Create filecheck sysfs related directories/files at
> @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  		status = -ENOMEM;
>  		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
>  			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> -		goto read_super_error;
> +		goto out_dismount;
>  	}
>  
>  	root = d_make_root(inode);
>  	if (!root) {
>  		status = -ENOMEM;
> -		mlog_errno(status);
> -		goto read_super_error;
> +		goto out_dismount;
>  	}
>  
>  	sb->s_root = root;
> @@ -1178,17 +1171,21 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	return status;
>  
> -read_super_error:
> -	brelse(bh);
> -
> -	if (status)
> -		mlog_errno(status);
> +out_dismount:
> +	atomic_set(&osb->vol_state, VOLUME_DISABLED);
> +	wake_up(&osb->osb_mount_event);
> +	ocfs2_dismount_volume(sb, 1);
> +	goto out;
>  
> -	if (osb) {
> -		atomic_set(&osb->vol_state, VOLUME_DISABLED);
> -		wake_up(&osb->osb_mount_event);
> -		ocfs2_dismount_volume(sb, 1);
> -	}
> +out_debugfs:
> +	debugfs_remove_recursive(osb->osb_debug_root);
> +out_super:
> +	ocfs2_release_system_inodes(osb);
> +	kfree(osb->recovery_map);
> +	ocfs2_delete_osb(osb);
> +	kfree(osb);
> +out:
> +	mlog_errno(status);
>  
>  	return status;
>  }

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-13 11:37   ` Joseph Qi via Ocfs2-devel
@ 2022-04-13 13:11     ` Heming Zhao via Ocfs2-devel
  2022-04-14  1:47       ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 15+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-13 13:11 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On Wed, Apr 13, 2022 at 07:37:59PM +0800, Joseph Qi wrote:
> It mostly looks good to me.
> Have you done error injection test after this patchset applied?
> I'm afraid ocfs2-test can't cover this as of now.
> 

I plan to write a debugfs api for error injection test.
The code only for these patches test

My idea:
echo N > /sys/kernel/debug/ocfs2/xxx/error
N: 1,2,3,4,...

"echo N" will make ocfs2 trigger Nth error point become true:

```
 	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
 	brelse(bh);
 	bh = NULL;
-	if (status < 0)
+	if (error_inject(n)) //eg. "echo 1 > xx" will trigger there "if()" becomes true.
 		goto out;
```

What's your opinion about my idea?
Or do you have other method to test?

- Heming

> Thanks,
> Joseph
> 
> On 4/13/22 4:29 PM, Heming Zhao wrote:
> > Current ocfs2_fill_super() uses one goto label "read_super_error" to
> > handle all error cases. And with previous serial patches, the error
> > handling should fork more branches to handle different error cases.
> > This patch rewrite the error handling of ocfs2_fill_super.
> > 
> > Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> > ---
> >  fs/ocfs2/super.c | 67 +++++++++++++++++++++++-------------------------
> >  1 file changed, 32 insertions(+), 35 deletions(-)
> > 
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index 5e860d7162d7..72673d40d29c 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
> >  		status = -EINVAL;
> > -		goto read_super_error;
> > +		goto out;
> >  	}
> >  
> >  	/* probe for superblock */
> >  	status = ocfs2_sb_probe(sb, &bh, &sector_size, &stats);
> >  	if (status < 0) {
> >  		mlog(ML_ERROR, "superblock probe failed!\n");
> > -		goto read_super_error;
> > +		goto out;
> >  	}
> >  
> >  	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> > -	osb = OCFS2_SB(sb);
> > -	if (status < 0) {
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > -	}
> >  	brelse(bh);
> >  	bh = NULL;
> > +	if (status < 0)
> > +		goto out;
> > +
> > +	osb = OCFS2_SB(sb);
> >  
> >  	if (!ocfs2_check_set_options(sb, &parsed_options)) {
> >  		status = -EINVAL;
> > -		goto read_super_error;
> > +		goto out_super;
> >  	}
> >  	osb->s_mount_opt = parsed_options.mount_opt;
> >  	osb->s_atime_quantum = parsed_options.atime_quantum;
> > @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	status = ocfs2_verify_userspace_stack(osb, &parsed_options);
> >  	if (status)
> > -		goto read_super_error;
> > +		goto out_super;
> >  
> >  	sb->s_magic = OCFS2_SUPER_MAGIC;
> >  
> > @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  			status = -EACCES;
> >  			mlog(ML_ERROR, "Readonly device detected but readonly "
> >  			     "mount was not specified.\n");
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		/* You should not be able to start a local heartbeat
> > @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  			status = -EROFS;
> >  			mlog(ML_ERROR, "Local heartbeat specified on readonly "
> >  			     "device.\n");
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		status = ocfs2_check_journals_nolocks(osb);
> > @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  				mlog(ML_ERROR, "Recovery required on readonly "
> >  				     "file system, but write access is "
> >  				     "unavailable.\n");
> > -			else
> > -				mlog_errno(status);
> > -			goto read_super_error;
> > +			goto out_super;
> >  		}
> >  
> >  		ocfs2_set_ro_flag(osb, 1);
> > @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  	}
> >  
> >  	status = ocfs2_verify_heartbeat(osb);
> > -	if (status < 0) {
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > -	}
> > +	if (status < 0)
> > +		goto out_super;
> >  
> >  	osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
> >  						 ocfs2_debugfs_root);
> > @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	status = ocfs2_mount_volume(sb);
> >  	if (status < 0)
> > -		goto read_super_error;
> > +		goto out_debugfs;
> >  
> >  	if (osb->root_inode)
> >  		inode = igrab(osb->root_inode);
> >  
> >  	if (!inode) {
> >  		status = -EIO;
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > +		goto out_dismount;
> >  	}
> >  
> >  	osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> > @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  	if (!osb->osb_dev_kset) {
> >  		status = -ENOMEM;
> >  		mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> > -		goto read_super_error;
> > +		goto out_dismount;
> >  	}
> >  
> >  	/* Create filecheck sysfs related directories/files at
> > @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  		status = -ENOMEM;
> >  		mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
> >  			"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> > -		goto read_super_error;
> > +		goto out_dismount;
> >  	}
> >  
> >  	root = d_make_root(inode);
> >  	if (!root) {
> >  		status = -ENOMEM;
> > -		mlog_errno(status);
> > -		goto read_super_error;
> > +		goto out_dismount;
> >  	}
> >  
> >  	sb->s_root = root;
> > @@ -1178,17 +1171,21 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >  
> >  	return status;
> >  
> > -read_super_error:
> > -	brelse(bh);
> > -
> > -	if (status)
> > -		mlog_errno(status);
> > +out_dismount:
> > +	atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > +	wake_up(&osb->osb_mount_event);
> > +	ocfs2_dismount_volume(sb, 1);
> > +	goto out;
> >  
> > -	if (osb) {
> > -		atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > -		wake_up(&osb->osb_mount_event);
> > -		ocfs2_dismount_volume(sb, 1);
> > -	}
> > +out_debugfs:
> > +	debugfs_remove_recursive(osb->osb_debug_root);
> > +out_super:
> > +	ocfs2_release_system_inodes(osb);
> > +	kfree(osb->recovery_map);
> > +	ocfs2_delete_osb(osb);
> > +	kfree(osb);
> > +out:
> > +	mlog_errno(status);
> >  
> >  	return status;
> >  }
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced
  2022-04-13 10:54   ` Joseph Qi via Ocfs2-devel
@ 2022-04-13 13:18     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-13 13:18 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/13/22 18:54, Joseph Qi wrote:
> 
> 
> On 4/13/22 4:29 PM, Heming Zhao wrote:
>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>> journal init later than before, it makes NULL pointer access in free
>> routine.
>>
>> Crash flow:
>>
>> ocfs2_fill_super
>>   + ocfs2_mount_volume
>>   |  + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>>   |  + ...
>>   |  + ocfs2_check_volume //no chance to init osb->journal
>>   |
>>   + ...
>>   + ocfs2_dismount_volume
>>      ocfs2_release_system_inodes
>>        ...
>>         evict
>>          ...
>>           ocfs2_clear_inode
>>            ocfs2_checkpoint_inode
>>             ocfs2_ci_fully_checkpointed
>>              time_after(journal->j_trans_id, ci->ci_last_trans)
>>               + journal is empty, crash!
>>
>> For fixing, there are three solutions:
>>
>> 1> Partly revert commit da5e7c87827e8
>>
>> For avoiding kernel crash, this make sense for us. We only concerned
>> whether there has any non-system inode access before dlm init. The
>> answer is NO. And all journal replay/recovery handling happen after
>> dlm & journal init done. So this method is not graceful but workable.
>>
>> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>>
>> The fix code is special for mounting phase, but it will continue
>> working after mounting stage. In another word, this method adds useless
>> code in normal inode free flow.
>>
>> 3> Do directly free inode in mounting phase
>>
>> This method is brutal/complex and may introduce unsafe code, currently
>> maintainer didn't like.
>>
>> At last, we chose method <1> and did partly reverted job.
>> We reverted journal init codes, and kept cleanup codes flow.
>>
>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/inode.c   |  4 ++--
>>   fs/ocfs2/journal.c | 32 ++++++++++++++++++++++----------
>>   fs/ocfs2/super.c   | 16 ++++++++++++++++
>>   3 files changed, 40 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>> index 5739dc301569..bb116c39b581 100644
>> --- a/fs/ocfs2/inode.c
>> +++ b/fs/ocfs2/inode.c
>> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>   	struct inode *inode = NULL;
>>   	struct super_block *sb = osb->sb;
>>   	struct ocfs2_find_inode_args args;
>> +	journal_t *journal = osb->journal->j_journal;
>>   
>>   	trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>>   			       sysfile_type);
>> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>   	 * part of the transaction - the inode could have been reclaimed and
>>   	 * now it is reread from disk.
>>   	 */
>> -	if (osb->journal) {
>> +	if (journal) {
>>   		transaction_t *transaction;
>>   		tid_t tid;
>>   		struct ocfs2_inode_info *oi = OCFS2_I(inode);
>> -		journal_t *journal = osb->journal->j_journal;
>>   
>>   		read_lock(&journal->j_state_lock);
>>   		if (journal->j_running_transaction)
>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>> index 1887a2708709..49255fddce45 100644
>> --- a/fs/ocfs2/journal.c
>> +++ b/fs/ocfs2/journal.c
>> @@ -810,22 +810,20 @@ void ocfs2_set_journal_params(struct ocfs2_super *osb)
>>   	write_unlock(&journal->j_state_lock);
>>   }
>>   
>> -int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>> +/*
>> + * alloc & initialize skeleton for journal structure.
>> + * ocfs2_journal_init() will make fs have journal ability.
>> + */
>> +int ocfs2_journal_alloc(struct ocfs2_super *osb)
>>   {
>> -	int status = -1;
>> -	struct inode *inode = NULL; /* the journal inode */
>> -	journal_t *j_journal = NULL;
>> -	struct ocfs2_journal *journal = NULL;
>> -	struct ocfs2_dinode *di = NULL;
>> -	struct buffer_head *bh = NULL;
>> -	int inode_lock = 0;
>> +	int status = 0;
>> +	struct ocfs2_journal *journal;
>>   
>> -	/* initialize our journal structure */
>>   	journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>>   	if (!journal) {
>>   		mlog(ML_ERROR, "unable to alloc journal\n");
>>   		status = -ENOMEM;
>> -		goto done;
>> +		goto bail;
>>   	}
>>   	osb->journal = journal;
>>   	journal->j_osb = osb;
>> @@ -839,6 +837,20 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>>   	INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>>   	journal->j_state = OCFS2_JOURNAL_FREE;
>>   
>> +bail:
>> +	return status;
>> +}
>> +
>> +int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>> +{
>> +	int status = -1;
>> +	struct inode *inode = NULL; /* the journal inode */
>> +	journal_t *j_journal = NULL;
>> +	struct ocfs2_journal *journal = osb->journal;
>> +	struct ocfs2_dinode *di = NULL;
>> +	struct buffer_head *bh = NULL;
>> +	int inode_lock = 0;
>> +
> 
> Better to leave a BUG_ON(!journal) here like before.

OK. restore BUG_ON().
The reason I removed it: In theory it's absolutely impossible become NULL.
> 
>>   	/* already have the inode for our journal */
>>   	inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>>   					    osb->slot_num);
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 477cdf94122e..babec2c9d638 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   	int i, cbits, bbits;
>>   	struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>   	struct inode *inode = NULL;
>> +	struct ocfs2_journal *journal;
>>   	struct ocfs2_super *osb;
>>   	u64 total_blocks;
>>   
>> @@ -2195,6 +2196,15 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>   
>>   	get_random_bytes(&osb->s_next_generation, sizeof(u32));
>>   
>> +	/*
>> +	 * FIXME
>> +	 * This should be done in ocfs2_journal_init(), but any inode
>> +	 * writes back operation will cause the filesystem to crash.
>> +	 */
>> +	status = ocfs2_journal_alloc(osb);
>> +	if (status)
> 
> Explicitly mark 'status < 0'?
> 

Sorry, my stupid mistake. I didn't compile & test v2 patch, and plan to test before v3.
For test, please check my comment in patch 5/5

- heming

> 
>> +		goto bail;
>> +
>>   	INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>>   	init_llist_head(&osb->dquot_drop_list);
>>   
>> @@ -2483,6 +2493,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>>   
>>   	kfree(osb->osb_orphan_wipes);
>>   	kfree(osb->slot_recovery_generations);
>> +	/* FIXME
>> +	 * This belongs in journal shutdown, but because we have to
>> +	 * allocate osb->journal at the middle of ocfs2_initialize_super(),
>> +	 * we free it here.
>> +	 */
>> +	kfree(osb->journal);
>>   	kfree(osb->local_alloc_copy);
>>   	kfree(osb->uuid_str);
>>   	kfree(osb->vol_label);
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
  2022-04-13 13:11     ` Heming Zhao via Ocfs2-devel
@ 2022-04-14  1:47       ` Joseph Qi via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-14  1:47 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel



On 4/13/22 9:11 PM, Heming Zhao wrote:
> On Wed, Apr 13, 2022 at 07:37:59PM +0800, Joseph Qi wrote:
>> It mostly looks good to me.
>> Have you done error injection test after this patchset applied?
>> I'm afraid ocfs2-test can't cover this as of now.
>>
> 
> I plan to write a debugfs api for error injection test.
> The code only for these patches test
> 
> My idea:
> echo N > /sys/kernel/debug/ocfs2/xxx/error
> N: 1,2,3,4,...
> 
> "echo N" will make ocfs2 trigger Nth error point become true:
> 
> ```
>  	status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
>  	brelse(bh);
>  	bh = NULL;
> -	if (status < 0)
> +	if (error_inject(n)) //eg. "echo 1 > xx" will trigger there "if()" becomes true.
>  		goto out;
> ```
> 
> What's your opinion about my idea?
> Or do you have other method to test?

Good idea, actually we did it in this way before since existing
fault-injection can only inject common failures such as slab allocation.

Thanks,
Joseph

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
  2022-04-13 11:25   ` Joseph Qi via Ocfs2-devel
@ 2022-04-14  9:14     ` heming.zhao--- via Ocfs2-devel
  0 siblings, 0 replies; 15+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-14  9:14 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 4/13/22 19:25, Joseph Qi wrote:
> 
> 
> On 4/13/22 4:29 PM, Heming Zhao wrote:
>> After this patch, when error, ocfs2_fill_super doesn't take care to
>> release resources which are allocated in ocfs2_mount_volume.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>>   fs/ocfs2/super.c | 42 +++++++++++++++++++++++++++---------------
>>   1 file changed, 27 insertions(+), 15 deletions(-)
>> ... ...
>> -leave:
>> -	if (unlock_super)
>> -		ocfs2_super_unlock(osb, 1);
>> +	ocfs2_super_unlock(osb, 1);
>> +	return 0;
>>   
>> +out_system_inodes:
>> +	if (osb->local_alloc_state == OCFS2_LA_ENABLED)
>> +		ocfs2_shutdown_local_alloc(osb);
>> +	ocfs2_release_system_inodes(osb);
>> +	ocfs2_journal_shutdown(osb);
>> +out_super_lock:
>> +	ocfs2_super_unlock(osb, 1);
>> +out_dlm:
>> +	ocfs2_dlm_shutdown(osb, 0);
>> +out:
>>   	return status;
>>   }
>>   
>> @@ -2393,14 +2402,17 @@ static int ocfs2_verify_volume(struct ocfs2_dinode *di,
>>   	return status;
>>   }
>>   
>> +/*
>> + * If this function returns failure, caller responds to release
>> + * here alloced resources.
>> + */
>>   static int ocfs2_check_volume(struct ocfs2_super *osb)
>>   {
>>   	int status;
>>   	int dirty;
>>   	int local;
>> -	struct ocfs2_dinode *local_alloc = NULL; /* only used if we
>> -						  * recover
>> -						  * ourselves. */
>> +    /* only used if we recover ourselves. */
> 
> Malformed. Actually I don't think we have to touch this function.
> 

OK. Will roll back my changes.
My original intention is to add comment at front of this function to info caller.
To remind them to don't forget to release the alloced resources when fails.

- Heming


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2022-04-14 11:47 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-13  8:29 [Ocfs2-devel] [PATCH v2 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 1/5] ocfs2: fix mounting crash if journal is not alloced Heming Zhao via Ocfs2-devel
2022-04-13 10:54   ` Joseph Qi via Ocfs2-devel
2022-04-13 13:18     ` heming.zhao--- via Ocfs2-devel
2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
2022-04-13 10:56   ` Joseph Qi via Ocfs2-devel
2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
2022-04-13 11:02   ` Joseph Qi via Ocfs2-devel
2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
2022-04-13 11:25   ` Joseph Qi via Ocfs2-devel
2022-04-14  9:14     ` heming.zhao--- via Ocfs2-devel
2022-04-13  8:29 ` [Ocfs2-devel] [PATCH v2 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
2022-04-13 11:37   ` Joseph Qi via Ocfs2-devel
2022-04-13 13:11     ` Heming Zhao via Ocfs2-devel
2022-04-14  1:47       ` Joseph Qi via Ocfs2-devel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).