* [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage
@ 2022-04-08 10:30 Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
` (4 more replies)
0 siblings, 5 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
draft -> v1:
- split one patch into 5 patches.
- goto labal name change to out_xxx style.
- only test for mount/umount & 0001-xx.patch related issue.
- polish some codes
note:
I didn't totally revert da5e7c87827e8, just revert init part codes.
Heming Zhao (5):
ocfs2: partly revert da5e7c87827e8 for mounting crash issue
ocfs2: change return type of ocfs2_resmap_init
ocfs2: ocfs2_initialize_super does cleanup job before return error
ocfs2: ocfs2_mount_volume does cleanup job before return error
ocfs2: rewrite error handling of ocfs2_fill_super
fs/ocfs2/inode.c | 4 +-
fs/ocfs2/journal.c | 19 ----
fs/ocfs2/reservations.c | 4 +-
fs/ocfs2/reservations.h | 5 +-
fs/ocfs2/super.c | 195 ++++++++++++++++++++++++++--------------
5 files changed, 132 insertions(+), 95 deletions(-)
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
2022-04-09 13:11 ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
` (3 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
journal init later than before, it makes NULL pointer access in free
routine.
Crash flow:
ocfs2_fill_super
+ ocfs2_mount_volume
| + ocfs2_dlm_init //fail & return, osb->journal is NULL.
| + ...
| + ocfs2_check_volume //no chance to init osb->journal
|
+ ...
+ ocfs2_dismount_volume
ocfs2_release_system_inodes
...
evict
...
ocfs2_clear_inode
ocfs2_checkpoint_inode
ocfs2_ci_fully_checkpointed
time_after(journal->j_trans_id, ci->ci_last_trans)
+ journal is empty, crash!
For fixing, there are three solutions:
1> Revert commit da5e7c87827e8
For avoiding kernel crash, this make sense for us. We only concerned
whether there has any non-system inode access before dlm init. The
answer is NO. And all journal replay/recovery handling happen after
dlm & journal init done. So this method is not graceful but workable.
2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
The fix code is special for mounting phase, but it will continue
working after mounting stage. In another word, this method adds useless
code in normal inode free flow.
3> Do directly free inode in mounting phase
This method is brutal/complex and may introduce unsafe code, currently
maintainer didn't like.
At last, we chose method <1> and partly reverted commit da5e7c87827e8.
We reverted journal init code, and kept cleanup code flow.
Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
fs/ocfs2/inode.c | 4 ++--
fs/ocfs2/journal.c | 21 +--------------------
fs/ocfs2/super.c | 33 +++++++++++++++++++++++++++++++++
3 files changed, 36 insertions(+), 22 deletions(-)
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 5739dc301569..bb116c39b581 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
struct inode *inode = NULL;
struct super_block *sb = osb->sb;
struct ocfs2_find_inode_args args;
+ journal_t *journal = osb->journal->j_journal;
trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
sysfile_type);
@@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
* part of the transaction - the inode could have been reclaimed and
* now it is reread from disk.
*/
- if (osb->journal) {
+ if (journal) {
transaction_t *transaction;
tid_t tid;
struct ocfs2_inode_info *oi = OCFS2_I(inode);
- journal_t *journal = osb->journal->j_journal;
read_lock(&journal->j_state_lock);
if (journal->j_running_transaction)
diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
index 1887a2708709..afb85de3bb60 100644
--- a/fs/ocfs2/journal.c
+++ b/fs/ocfs2/journal.c
@@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
int status = -1;
struct inode *inode = NULL; /* the journal inode */
journal_t *j_journal = NULL;
- struct ocfs2_journal *journal = NULL;
+ struct ocfs2_journal *journal = osb->journal;
struct ocfs2_dinode *di = NULL;
struct buffer_head *bh = NULL;
int inode_lock = 0;
- /* initialize our journal structure */
- journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
- if (!journal) {
- mlog(ML_ERROR, "unable to alloc journal\n");
- status = -ENOMEM;
- goto done;
- }
- osb->journal = journal;
- journal->j_osb = osb;
-
- atomic_set(&journal->j_num_trans, 0);
- init_rwsem(&journal->j_trans_barrier);
- init_waitqueue_head(&journal->j_checkpointed);
- spin_lock_init(&journal->j_lock);
- journal->j_trans_id = 1UL;
- INIT_LIST_HEAD(&journal->j_la_cleanups);
- INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
- journal->j_state = OCFS2_JOURNAL_FREE;
-
/* already have the inode for our journal */
inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
osb->slot_num);
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 477cdf94122e..5f63a2333e52 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
int i, cbits, bbits;
struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
struct inode *inode = NULL;
+ struct ocfs2_journal *journal;
struct ocfs2_super *osb;
u64 total_blocks;
@@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
get_random_bytes(&osb->s_next_generation, sizeof(u32));
+ /* FIXME
+ * This should be done in ocfs2_journal_init(), but unknown
+ * ordering issues will cause the filesystem to crash.
+ * If anyone wants to figure out what part of the code
+ * refers to osb->journal before ocfs2_journal_init() is run,
+ * be my guest.
+ */
+ /* initialize our journal structure */
+ journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
+ if (!journal) {
+ mlog(ML_ERROR, "unable to alloc journal\n");
+ status = -ENOMEM;
+ goto bail;
+ }
+ osb->journal = journal;
+ journal->j_osb = osb;
+
+ atomic_set(&journal->j_num_trans, 0);
+ init_rwsem(&journal->j_trans_barrier);
+ init_waitqueue_head(&journal->j_checkpointed);
+ spin_lock_init(&journal->j_lock);
+ journal->j_trans_id = 1UL;
+ INIT_LIST_HEAD(&journal->j_la_cleanups);
+ INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
+ journal->j_state = OCFS2_JOURNAL_FREE;
+
INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
init_llist_head(&osb->dquot_drop_list);
@@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
kfree(osb->osb_orphan_wipes);
kfree(osb->slot_recovery_generations);
+ /* FIXME
+ * This belongs in journal shutdown, but because we have to
+ * allocate osb->journal at the middle of ocfs2_initialize_super(),
+ * we free it here.
+ */
+ kfree(osb->journal);
kfree(osb->local_alloc_copy);
kfree(osb->uuid_str);
kfree(osb->vol_label);
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
2022-04-09 13:14 ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
` (2 subsequent siblings)
4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
For rewrite error handling of ocfs2_initialize_super, change this
function return type from int to void.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
fs/ocfs2/reservations.c | 4 +---
fs/ocfs2/reservations.h | 5 ++---
fs/ocfs2/super.c | 6 +-----
3 files changed, 4 insertions(+), 11 deletions(-)
diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
index 769e466887b0..a9d1296d736d 100644
--- a/fs/ocfs2/reservations.c
+++ b/fs/ocfs2/reservations.c
@@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
resv->r_flags |= flags;
}
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
struct ocfs2_reservation_map *resmap)
{
memset(resmap, 0, sizeof(*resmap));
@@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
resmap->m_reservations = RB_ROOT;
/* m_bitmap_len is initialized to zero by the above memset. */
INIT_LIST_HEAD(&resmap->m_lru);
-
- return 0;
}
static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
index 677c50663595..532d42eb6db7 100644
--- a/fs/ocfs2/reservations.h
+++ b/fs/ocfs2/reservations.h
@@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
* @ops: unused for now
* @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
*
- * Only possible return value other than '0' is -ENOMEM for failure to
- * allocation mirror bitmap.
+ * This function does initialize resmap job.
*/
-int ocfs2_resmap_init(struct ocfs2_super *osb,
+void ocfs2_resmap_init(struct ocfs2_super *osb,
struct ocfs2_reservation_map *resmap);
/**
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 8391592817f8..f91c5510bc7e 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
init_waitqueue_head(&osb->osb_mount_event);
- status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
- if (status) {
- mlog_errno(status);
- goto bail;
- }
+ ocfs2_resmap_init(osb, &osb->osb_la_resmap);
osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
if (!osb->vol_label) {
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
2022-04-09 13:30 ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_initialize_super.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
1 file changed, 40 insertions(+), 18 deletions(-)
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index f91c5510bc7e..8443ba031dec 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!osb) {
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ goto out;
}
sb->s_fs_info = osb;
@@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
osb->max_slots);
status = -EINVAL;
- goto bail;
+ goto out;
}
ocfs2_orphan_scan_init(osb);
@@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (status) {
mlog(ML_ERROR, "Unable to initialize recovery state\n");
mlog_errno(status);
- goto bail;
+ goto out;
}
init_waitqueue_head(&osb->checkpoint_event);
@@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!osb->vol_label) {
mlog(ML_ERROR, "unable to alloc vol label\n");
status = -ENOMEM;
- goto bail;
+ goto out_recovery_map;
}
osb->slot_recovery_generations =
@@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!osb->slot_recovery_generations) {
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ goto out_vol_label;
}
init_waitqueue_head(&osb->osb_wipe_event);
@@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!osb->osb_orphan_wipes) {
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ goto out_slot_recovery_gen;
}
osb->osb_rf_lock_tree = RB_ROOT;
@@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
mlog(ML_ERROR, "couldn't mount because of unsupported "
"optional features (%x).\n", i);
status = -EINVAL;
- goto bail;
+ goto out_orphan_wipes;
}
if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
mlog(ML_ERROR, "couldn't mount RDWR because of "
"unsupported optional features (%x).\n", i);
status = -EINVAL;
- goto bail;
+ goto out_orphan_wipes;
}
if (ocfs2_clusterinfo_valid(osb)) {
@@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
"cluster stack label (%s) \n",
osb->osb_cluster_stack);
status = -EINVAL;
- goto bail;
+ goto out_orphan_wipes;
}
memcpy(osb->osb_cluster_name,
OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
@@ -2204,7 +2204,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!journal) {
mlog(ML_ERROR, "unable to alloc journal\n");
status = -ENOMEM;
- goto bail;
+ goto out_orphan_wipes;
}
osb->journal = journal;
journal->j_osb = osb;
@@ -2231,7 +2231,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
osb->s_clustersize);
status = -EINVAL;
- goto bail;
+ goto out_journal;
}
total_blocks = ocfs2_clusters_to_blocks(osb->sb,
@@ -2243,14 +2243,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
mlog(ML_ERROR, "Volume too large "
"to mount safely on this system");
status = -EFBIG;
- goto bail;
+ goto out_journal;
}
if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
sizeof(di->id2.i_super.s_uuid))) {
mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
status = -ENOMEM;
- goto bail;
+ goto out_journal;
}
strlcpy(osb->vol_label, di->id2.i_super.s_label,
@@ -2270,7 +2270,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!osb->osb_dlm_debug) {
status = -ENOMEM;
mlog_errno(status);
- goto bail;
+ goto out_uuid_str;
}
atomic_set(&osb->vol_state, VOLUME_INIT);
@@ -2279,7 +2279,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
status = ocfs2_init_global_system_inodes(osb);
if (status < 0) {
mlog_errno(status);
- goto bail;
+ goto out_dlm_out;
}
/*
@@ -2290,7 +2290,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
if (!inode) {
status = -EINVAL;
mlog_errno(status);
- goto bail;
+ goto out_system_inodes;
}
osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
@@ -2303,16 +2303,38 @@ static int ocfs2_initialize_super(struct super_block *sb,
status = ocfs2_init_slot_info(osb);
if (status < 0) {
mlog_errno(status);
- goto bail;
+ goto out_system_inodes;
}
osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
if (!osb->ocfs2_wq) {
status = -ENOMEM;
mlog_errno(status);
+ goto out_slot_info;
}
-bail:
+ return status;
+
+out_slot_info:
+ ocfs2_free_slot_info(osb);
+out_system_inodes:
+ ocfs2_release_system_inodes(osb);
+out_dlm_out:
+ ocfs2_put_dlm_debug(osb->osb_dlm_debug);
+out_uuid_str:
+ kfree(osb->uuid_str);
+out_journal:
+ kfree(osb->journal);
+out_orphan_wipes:
+ kfree(osb->osb_orphan_wipes);
+out_slot_recovery_gen:
+ kfree(osb->slot_recovery_generations);
+out_vol_label:
+ kfree(osb->vol_label);
+out_recovery_map:
+ kfree(osb->recovery_map);
+out:
+ kfree(osb);
return status;
}
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
` (2 preceding siblings ...)
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
2022-04-09 13:46 ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
After this patch, when error, ocfs2_fill_super doesn't take care to
release resources which are allocated in ocfs2_mount_volume.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
1 file changed, 20 insertions(+), 12 deletions(-)
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 8443ba031dec..d4b7a29cb364 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
static int ocfs2_mount_volume(struct super_block *sb)
{
int status = 0;
- int unlock_super = 0;
struct ocfs2_super *osb = OCFS2_SB(sb);
if (ocfs2_is_hard_readonly(osb))
- goto leave;
+ goto out;
mutex_init(&osb->obs_trim_fs_mutex);
@@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
if (status == -EBADR && ocfs2_userspace_stack(osb))
mlog(ML_ERROR, "couldn't mount because cluster name on"
" disk does not match the running cluster name.\n");
- goto leave;
+ goto out;
}
status = ocfs2_super_lock(osb, 1);
if (status < 0) {
mlog_errno(status);
- goto leave;
+ goto out_dlm;
}
- unlock_super = 1;
/* This will load up the node map and add ourselves to it. */
status = ocfs2_find_slot(osb);
if (status < 0) {
mlog_errno(status);
- goto leave;
+ goto out_super_lock;
}
/* load all node-local system inodes */
status = ocfs2_init_local_system_inodes(osb);
if (status < 0) {
mlog_errno(status);
- goto leave;
+ goto out_super_lock;
}
status = ocfs2_check_volume(osb);
if (status < 0) {
mlog_errno(status);
- goto leave;
+ goto out_system_inodes;
}
status = ocfs2_truncate_log_init(osb);
- if (status < 0)
+ if (status < 0) {
mlog_errno(status);
+ goto out_journal_shutdown;
+ }
-leave:
- if (unlock_super)
- ocfs2_super_unlock(osb, 1);
+ ocfs2_super_unlock(osb, 1);
+ return status;
+out_journal_shutdown:
+ ocfs2_journal_shutdown(osb);
+out_system_inodes:
+ ocfs2_release_system_inodes(osb);
+out_super_lock:
+ ocfs2_super_unlock(osb, 1);
+out_dlm:
+ ocfs2_dlm_shutdown(osb, 0);
+out:
return status;
}
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply related [flat|nested] 23+ messages in thread
* [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
` (3 preceding siblings ...)
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-08 10:30 ` Heming Zhao via Ocfs2-devel
2022-04-09 14:02 ` Joseph Qi via Ocfs2-devel
4 siblings, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-08 10:30 UTC (permalink / raw)
To: ocfs2-devel, joseph.qi
current ocfs2_fill_super error handling is very simple, and includes
kernel crash possibility.
e.g.
ocfs2_fill_super
ocfs2_initialize_super // fails in check max_slots & return
Then osb->osb_mount_event is not properly initialized, and then will
crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
For fixing the issue and enhancing error handling, we rewrite error
handling of ocfs2_fill_super.
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
1 file changed, 33 insertions(+), 34 deletions(-)
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index d4b7a29cb364..c3efd9ad49ac 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
status = -EINVAL;
- goto read_super_error;
+ goto out;
}
/* probe for superblock */
status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
if (status < 0) {
mlog(ML_ERROR, "superblock probe failed!\n");
- goto read_super_error;
+ goto out;
}
status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
- osb = OCFS2_SB(sb);
- if (status < 0) {
- mlog_errno(status);
- goto read_super_error;
- }
brelse(bh);
bh = NULL;
+ if (status < 0)
+ goto out;
+
+ osb = OCFS2_SB(sb);
if (!ocfs2_check_set_options(sb, &parsed_options)) {
status = -EINVAL;
- goto read_super_error;
+ goto out_super;
}
osb->s_mount_opt = parsed_options.mount_opt;
osb->s_atime_quantum = parsed_options.atime_quantum;
@@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
status = ocfs2_verify_userspace_stack(osb, &parsed_options);
if (status)
- goto read_super_error;
+ goto out_super;
sb->s_magic = OCFS2_SUPER_MAGIC;
@@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
status = -EACCES;
mlog(ML_ERROR, "Readonly device detected but readonly "
"mount was not specified.\n");
- goto read_super_error;
+ goto out_super;
}
/* You should not be able to start a local heartbeat
@@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
status = -EROFS;
mlog(ML_ERROR, "Local heartbeat specified on readonly "
"device.\n");
- goto read_super_error;
+ goto out_super;
}
status = ocfs2_check_journals_nolocks(osb);
@@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
mlog(ML_ERROR, "Recovery required on readonly "
"file system, but write access is "
"unavailable.\n");
- else
- mlog_errno(status);
- goto read_super_error;
+ goto out_super;
}
ocfs2_set_ro_flag(osb, 1);
@@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
}
status = ocfs2_verify_heartbeat(osb);
- if (status < 0) {
- mlog_errno(status);
- goto read_super_error;
- }
+ if (status < 0)
+ goto out_super;
osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
ocfs2_debugfs_root);
@@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
status = ocfs2_mount_volume(sb);
if (status < 0)
- goto read_super_error;
+ goto out_debugfs;
if (osb->root_inode)
inode = igrab(osb->root_inode);
if (!inode) {
status = -EIO;
- mlog_errno(status);
- goto read_super_error;
+ goto out_journal;
}
osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
@@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
if (!osb->osb_dev_kset) {
status = -ENOMEM;
mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
- goto read_super_error;
+ goto out_journal;
}
/* Create filecheck sysfs related directories/files at
@@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
status = -ENOMEM;
mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
"/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
- goto read_super_error;
+ goto out_journal;
}
root = d_make_root(inode);
if (!root) {
status = -ENOMEM;
- mlog_errno(status);
- goto read_super_error;
+ goto out_journal;
}
sb->s_root = root;
@@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
return status;
-read_super_error:
- brelse(bh);
+out_journal:
+ mlog_errno(status);
+ atomic_set(&osb->vol_state, VOLUME_DISABLED);
+ wake_up(&osb->osb_mount_event);
+ ocfs2_dismount_volume(sb, 1);
- if (status)
- mlog_errno(status);
+ return status;
- if (osb) {
- atomic_set(&osb->vol_state, VOLUME_DISABLED);
- wake_up(&osb->osb_mount_event);
- ocfs2_dismount_volume(sb, 1);
- }
+out_debugfs:
+ debugfs_remove_recursive(osb->osb_debug_root);
+out_super:
+ ocfs2_release_system_inodes(osb);
+ kfree(osb->recovery_map);
+ ocfs2_delete_osb(osb);
+ kfree(osb);
+out:
+ mlog_errno(status);
return status;
}
--
2.35.1
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:11 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:28 ` heming.zhao--- via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:11 UTC (permalink / raw)
To: Heming Zhao, ocfs2-devel
Suggest rename the subject, something like
ocfs2: fix mount crash if journal is not alloced
On 4/8/22 6:30 PM, Heming Zhao wrote:
> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
> journal init later than before, it makes NULL pointer access in free
> routine.
>
> Crash flow:
>
> ocfs2_fill_super
> + ocfs2_mount_volume
> | + ocfs2_dlm_init //fail & return, osb->journal is NULL.
> | + ...
> | + ocfs2_check_volume //no chance to init osb->journal
> |
> + ...
> + ocfs2_dismount_volume
> ocfs2_release_system_inodes
> ...
> evict
> ...
> ocfs2_clear_inode
> ocfs2_checkpoint_inode
> ocfs2_ci_fully_checkpointed
> time_after(journal->j_trans_id, ci->ci_last_trans)
> + journal is empty, crash!
>
> For fixing, there are three solutions:
>
> 1> Revert commit da5e7c87827e8
>
> For avoiding kernel crash, this make sense for us. We only concerned
> whether there has any non-system inode access before dlm init. The
> answer is NO. And all journal replay/recovery handling happen after
> dlm & journal init done. So this method is not graceful but workable.
>
> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>
> The fix code is special for mounting phase, but it will continue
> working after mounting stage. In another word, this method adds useless
> code in normal inode free flow.
>
> 3> Do directly free inode in mounting phase
>
> This method is brutal/complex and may introduce unsafe code, currently
> maintainer didn't like.
>
> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
> We reverted journal init code, and kept cleanup code flow.
>
> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> fs/ocfs2/inode.c | 4 ++--
> fs/ocfs2/journal.c | 21 +--------------------
> fs/ocfs2/super.c | 33 +++++++++++++++++++++++++++++++++
> 3 files changed, 36 insertions(+), 22 deletions(-)
>
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 5739dc301569..bb116c39b581 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
> struct inode *inode = NULL;
> struct super_block *sb = osb->sb;
> struct ocfs2_find_inode_args args;
> + journal_t *journal = osb->journal->j_journal;
>
> trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
> sysfile_type);
> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
> * part of the transaction - the inode could have been reclaimed and
> * now it is reread from disk.
> */
> - if (osb->journal) {
> + if (journal) {
> transaction_t *transaction;
> tid_t tid;
> struct ocfs2_inode_info *oi = OCFS2_I(inode);
> - journal_t *journal = osb->journal->j_journal;
>
> read_lock(&journal->j_state_lock);
> if (journal->j_running_transaction)
> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
> index 1887a2708709..afb85de3bb60 100644
> --- a/fs/ocfs2/journal.c
> +++ b/fs/ocfs2/journal.c
> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
> int status = -1;
> struct inode *inode = NULL; /* the journal inode */
> journal_t *j_journal = NULL;
> - struct ocfs2_journal *journal = NULL;
> + struct ocfs2_journal *journal = osb->journal;
> struct ocfs2_dinode *di = NULL;
> struct buffer_head *bh = NULL;
> int inode_lock = 0;
>
> - /* initialize our journal structure */
> - journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
> - if (!journal) {
> - mlog(ML_ERROR, "unable to alloc journal\n");
> - status = -ENOMEM;
> - goto done;
> - }
> - osb->journal = journal;
> - journal->j_osb = osb;
> -
> - atomic_set(&journal->j_num_trans, 0);
> - init_rwsem(&journal->j_trans_barrier);
> - init_waitqueue_head(&journal->j_checkpointed);
> - spin_lock_init(&journal->j_lock);
> - journal->j_trans_id = 1UL;
> - INIT_LIST_HEAD(&journal->j_la_cleanups);
> - INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
> - journal->j_state = OCFS2_JOURNAL_FREE;
> -
> /* already have the inode for our journal */
> inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
> osb->slot_num);
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 477cdf94122e..5f63a2333e52 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> int i, cbits, bbits;
> struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
> struct inode *inode = NULL;
> + struct ocfs2_journal *journal;
> struct ocfs2_super *osb;
> u64 total_blocks;
>
> @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>
> get_random_bytes(&osb->s_next_generation, sizeof(u32));
>
> + /* FIXME
> + * This should be done in ocfs2_journal_init(), but unknown
> + * ordering issues will cause the filesystem to crash.
> + * If anyone wants to figure out what part of the code
> + * refers to osb->journal before ocfs2_journal_init() is run,
> + * be my guest.
> + */
> + /* initialize our journal structure */
> + journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
> + if (!journal) {
> + mlog(ML_ERROR, "unable to alloc journal\n");
> + status = -ENOMEM;
> + goto bail;
> + }
> + osb->journal = journal;
> + journal->j_osb = osb;
> +
> + atomic_set(&journal->j_num_trans, 0);
> + init_rwsem(&journal->j_trans_barrier);
> + init_waitqueue_head(&journal->j_checkpointed);
> + spin_lock_init(&journal->j_lock);
> + journal->j_trans_id = 1UL;
> + INIT_LIST_HEAD(&journal->j_la_cleanups);
> + INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
> + journal->j_state = OCFS2_JOURNAL_FREE;
> +
We may fold these into a new function, such as ocfs2_journal_alloc().
Thanks,
Joseph
> INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
> init_llist_head(&osb->dquot_drop_list);
>
> @@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>
> kfree(osb->osb_orphan_wipes);
> kfree(osb->slot_recovery_generations);
> + /* FIXME
> + * This belongs in journal shutdown, but because we have to
> + * allocate osb->journal at the middle of ocfs2_initialize_super(),
> + * we free it here.
> + */
> + kfree(osb->journal);
> kfree(osb->local_alloc_copy);
> kfree(osb->uuid_str);
> kfree(osb->vol_label);
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:14 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:29 ` heming.zhao--- via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:14 UTC (permalink / raw)
To: Heming Zhao, ocfs2-devel
On 4/8/22 6:30 PM, Heming Zhao wrote:
> For rewrite error handling of ocfs2_initialize_super, change this
> function return type from int to void.
>
Since ocfs2_resmap_init() always return 0, change it to void.
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> fs/ocfs2/reservations.c | 4 +---
> fs/ocfs2/reservations.h | 5 ++---
> fs/ocfs2/super.c | 6 +-----
> 3 files changed, 4 insertions(+), 11 deletions(-)
>
> diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
> index 769e466887b0..a9d1296d736d 100644
> --- a/fs/ocfs2/reservations.c
> +++ b/fs/ocfs2/reservations.c
> @@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
> resv->r_flags |= flags;
> }
>
> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
> struct ocfs2_reservation_map *resmap)
> {
> memset(resmap, 0, sizeof(*resmap));
> @@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
> resmap->m_reservations = RB_ROOT;
> /* m_bitmap_len is initialized to zero by the above memset. */
> INIT_LIST_HEAD(&resmap->m_lru);
> -
> - return 0;
> }
>
> static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
> diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
> index 677c50663595..532d42eb6db7 100644
> --- a/fs/ocfs2/reservations.h
> +++ b/fs/ocfs2/reservations.h
> @@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
> * @ops: unused for now
> * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
> *
> - * Only possible return value other than '0' is -ENOMEM for failure to
> - * allocation mirror bitmap.
> + * This function does initialize resmap job.
> */
This comments is meaningless, so just remove it looks better.
Thanks,
Joseph
> -int ocfs2_resmap_init(struct ocfs2_super *osb,
> +void ocfs2_resmap_init(struct ocfs2_super *osb,
> struct ocfs2_reservation_map *resmap);
>
> /**
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 8391592817f8..f91c5510bc7e 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>
> init_waitqueue_head(&osb->osb_mount_event);
>
> - status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
> - if (status) {
> - mlog_errno(status);
> - goto bail;
> - }
> + ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>
> osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
> if (!osb->vol_label) {
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:30 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:47 ` heming.zhao--- via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:30 UTC (permalink / raw)
To: Heming Zhao, ocfs2-devel
On 4/8/22 6:30 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_initialize_super.
>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
> 1 file changed, 40 insertions(+), 18 deletions(-)
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index f91c5510bc7e..8443ba031dec 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!osb) {
> status = -ENOMEM;
> mlog_errno(status);
> - goto bail;
> + goto out;
> }
>
> sb->s_fs_info = osb;
> @@ -2084,7 +2084,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> mlog(ML_ERROR, "Invalid number of node slots (%u)\n",
> osb->max_slots);
> status = -EINVAL;
> - goto bail;
> + goto out;
> }
>
> ocfs2_orphan_scan_init(osb);
> @@ -2093,7 +2093,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (status) {
> mlog(ML_ERROR, "Unable to initialize recovery state\n");
> mlog_errno(status);
> - goto bail;
> + goto out;
> }
>
> init_waitqueue_head(&osb->checkpoint_event);
> @@ -2117,7 +2117,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!osb->vol_label) {
> mlog(ML_ERROR, "unable to alloc vol label\n");
> status = -ENOMEM;
> - goto bail;
> + goto out_recovery_map;
> }
>
> osb->slot_recovery_generations =
> @@ -2126,7 +2126,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!osb->slot_recovery_generations) {
> status = -ENOMEM;
> mlog_errno(status);
> - goto bail;
> + goto out_vol_label;
> }
>
> init_waitqueue_head(&osb->osb_wipe_event);
> @@ -2136,7 +2136,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!osb->osb_orphan_wipes) {
> status = -ENOMEM;
> mlog_errno(status);
> - goto bail;
> + goto out_slot_recovery_gen;
> }
>
> osb->osb_rf_lock_tree = RB_ROOT;
> @@ -2152,13 +2152,13 @@ static int ocfs2_initialize_super(struct super_block *sb,
> mlog(ML_ERROR, "couldn't mount because of unsupported "
> "optional features (%x).\n", i);
> status = -EINVAL;
> - goto bail;
> + goto out_orphan_wipes;
> }
> if (!sb_rdonly(osb->sb) && (i = OCFS2_HAS_RO_COMPAT_FEATURE(osb->sb, ~OCFS2_FEATURE_RO_COMPAT_SUPP))) {
> mlog(ML_ERROR, "couldn't mount RDWR because of "
> "unsupported optional features (%x).\n", i);
> status = -EINVAL;
> - goto bail;
> + goto out_orphan_wipes;
> }
>
> if (ocfs2_clusterinfo_valid(osb)) {
> @@ -2179,7 +2179,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> "cluster stack label (%s) \n",
> osb->osb_cluster_stack);
> status = -EINVAL;
> - goto bail;
> + goto out_orphan_wipes;
> }
> memcpy(osb->osb_cluster_name,
> OCFS2_RAW_SB(di)->s_cluster_info.ci_cluster,
> @@ -2204,7 +2204,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!journal) {
> mlog(ML_ERROR, "unable to alloc journal\n");
> status = -ENOMEM;
> - goto bail;
> + goto out_orphan_wipes;
> }
> osb->journal = journal;
> journal->j_osb = osb;
> @@ -2231,7 +2231,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> mlog(ML_ERROR, "Volume has invalid cluster size (%d)\n",
> osb->s_clustersize);
> status = -EINVAL;
> - goto bail;
> + goto out_journal;
> }
>
> total_blocks = ocfs2_clusters_to_blocks(osb->sb,
> @@ -2243,14 +2243,14 @@ static int ocfs2_initialize_super(struct super_block *sb,
> mlog(ML_ERROR, "Volume too large "
> "to mount safely on this system");
> status = -EFBIG;
> - goto bail;
> + goto out_journal;
> }
>
> if (ocfs2_setup_osb_uuid(osb, di->id2.i_super.s_uuid,
> sizeof(di->id2.i_super.s_uuid))) {
> mlog(ML_ERROR, "Out of memory trying to setup our uuid.\n");
> status = -ENOMEM;
> - goto bail;
> + goto out_journal;
> }
>
> strlcpy(osb->vol_label, di->id2.i_super.s_label,
> @@ -2270,7 +2270,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!osb->osb_dlm_debug) {
> status = -ENOMEM;
> mlog_errno(status);
> - goto bail;
> + goto out_uuid_str;
> }
>
> atomic_set(&osb->vol_state, VOLUME_INIT);
> @@ -2279,7 +2279,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> status = ocfs2_init_global_system_inodes(osb);
> if (status < 0) {
> mlog_errno(status);
> - goto bail;
> + goto out_dlm_out;
> }
>
> /*
> @@ -2290,7 +2290,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
> if (!inode) {
> status = -EINVAL;
> mlog_errno(status);
> - goto bail;
> + goto out_system_inodes;
> }
>
> osb->bitmap_blkno = OCFS2_I(inode)->ip_blkno;
> @@ -2303,16 +2303,38 @@ static int ocfs2_initialize_super(struct super_block *sb,
> status = ocfs2_init_slot_info(osb);
> if (status < 0) {
> mlog_errno(status);
> - goto bail;
> + goto out_system_inodes;
> }
>
> osb->ocfs2_wq = alloc_ordered_workqueue("ocfs2_wq", WQ_MEM_RECLAIM);
> if (!osb->ocfs2_wq) {
> status = -ENOMEM;
> mlog_errno(status);
> + goto out_slot_info;
> }
>
> -bail:
> + return status;
> +
> +out_slot_info:
> + ocfs2_free_slot_info(osb);
> +out_system_inodes:
> + ocfs2_release_system_inodes(osb);
> +out_dlm_out:
> + ocfs2_put_dlm_debug(osb->osb_dlm_debug);
> +out_uuid_str:
> + kfree(osb->uuid_str);
> +out_journal:
> + kfree(osb->journal);
> +out_orphan_wipes:
> + kfree(osb->osb_orphan_wipes);
> +out_slot_recovery_gen:
> + kfree(osb->slot_recovery_generations);
> +out_vol_label:
> + kfree(osb->vol_label);
> +out_recovery_map:
> + kfree(osb->recovery_map);
> +out:
> + kfree(osb);
Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
Thanks,
Joseph
> return status;
> }
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
@ 2022-04-09 13:46 ` Joseph Qi via Ocfs2-devel
2022-04-10 3:58 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:22 ` Heming Zhao via Ocfs2-devel
0 siblings, 2 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 13:46 UTC (permalink / raw)
To: Heming Zhao, ocfs2-devel
On 4/8/22 6:30 PM, Heming Zhao wrote:
> After this patch, when error, ocfs2_fill_super doesn't take care to
> release resources which are allocated in ocfs2_mount_volume.
>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
> 1 file changed, 20 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 8443ba031dec..d4b7a29cb364 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
> static int ocfs2_mount_volume(struct super_block *sb)
> {
> int status = 0;
> - int unlock_super = 0;
> struct ocfs2_super *osb = OCFS2_SB(sb);
>
> if (ocfs2_is_hard_readonly(osb))
> - goto leave;
> + goto out;
>
> mutex_init(&osb->obs_trim_fs_mutex);
>
> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
> if (status == -EBADR && ocfs2_userspace_stack(osb))
> mlog(ML_ERROR, "couldn't mount because cluster name on"
> " disk does not match the running cluster name.\n");
> - goto leave;
> + goto out;
> }
>
> status = ocfs2_super_lock(osb, 1);
> if (status < 0) {
> mlog_errno(status);
> - goto leave;
> + goto out_dlm;
> }
> - unlock_super = 1;
>
> /* This will load up the node map and add ourselves to it. */
> status = ocfs2_find_slot(osb);
> if (status < 0) {
> mlog_errno(status);
> - goto leave;
> + goto out_super_lock;
> }
>
> /* load all node-local system inodes */
> status = ocfs2_init_local_system_inodes(osb);
> if (status < 0) {
> mlog_errno(status);
> - goto leave;
> + goto out_super_lock;
> }
>
> status = ocfs2_check_volume(osb);
> if (status < 0) {
> mlog_errno(status);
> - goto leave;
> + goto out_system_inodes;
> }
>
> status = ocfs2_truncate_log_init(osb);
> - if (status < 0)
> + if (status < 0) {
> mlog_errno(status);
> + goto out_journal_shutdown;
ocfs2_check_volume() not only does journal initialization,
but also local alloc.
> + }
>
> -leave:
> - if (unlock_super)
> - ocfs2_super_unlock(osb, 1);
> + ocfs2_super_unlock(osb, 1);
> + return status;
Explicitly return 0 may be better, which means this is the
normal path.
Thanks,
Joseph
>
> +out_journal_shutdown:
> + ocfs2_journal_shutdown(osb);
> +out_system_inodes:
> + ocfs2_release_system_inodes(osb);
> +out_super_lock:
> + ocfs2_super_unlock(osb, 1);
> +out_dlm:
> + ocfs2_dlm_shutdown(osb, 0);
> +out:
> return status;
> }
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
@ 2022-04-09 14:02 ` Joseph Qi via Ocfs2-devel
2022-04-10 10:25 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:49 ` Heming Zhao via Ocfs2-devel
0 siblings, 2 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-09 14:02 UTC (permalink / raw)
To: Heming Zhao, ocfs2-devel
On 4/8/22 6:30 PM, Heming Zhao wrote:
> current ocfs2_fill_super error handling is very simple, and includes
> kernel crash possibility.
>
> e.g.
>
> ocfs2_fill_super
> ocfs2_initialize_super // fails in check max_slots & return
>
> Then osb->osb_mount_event is not properly initialized, and then will
> crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
>
With patch 3, this case won't exist any more.
> For fixing the issue and enhancing error handling, we rewrite error
> handling of ocfs2_fill_super.
>
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
> 1 file changed, 33 insertions(+), 34 deletions(-)
>
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index d4b7a29cb364..c3efd9ad49ac 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>
> if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
> status = -EINVAL;
> - goto read_super_error;
> + goto out;
> }
>
> /* probe for superblock */
> status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
> if (status < 0) {
> mlog(ML_ERROR, "superblock probe failed!\n");
> - goto read_super_error;
> + goto out;
> }
>
> status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> - osb = OCFS2_SB(sb);
> - if (status < 0) {
> - mlog_errno(status);
> - goto read_super_error;
> - }
> brelse(bh);
> bh = NULL;
> + if (status < 0)
> + goto out;
> +
> + osb = OCFS2_SB(sb);
>
> if (!ocfs2_check_set_options(sb, &parsed_options)) {
> status = -EINVAL;
> - goto read_super_error;
> + goto out_super;
> }
> osb->s_mount_opt = parsed_options.mount_opt;
> osb->s_atime_quantum = parsed_options.atime_quantum;
> @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>
> status = ocfs2_verify_userspace_stack(osb, &parsed_options);
> if (status)
> - goto read_super_error;
> + goto out_super;
>
> sb->s_magic = OCFS2_SUPER_MAGIC;
>
> @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> status = -EACCES;
> mlog(ML_ERROR, "Readonly device detected but readonly "
> "mount was not specified.\n");
> - goto read_super_error;
> + goto out_super;
> }
>
> /* You should not be able to start a local heartbeat
> @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> status = -EROFS;
> mlog(ML_ERROR, "Local heartbeat specified on readonly "
> "device.\n");
> - goto read_super_error;
> + goto out_super;
> }
>
> status = ocfs2_check_journals_nolocks(osb);
> @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> mlog(ML_ERROR, "Recovery required on readonly "
> "file system, but write access is "
> "unavailable.\n");
> - else
> - mlog_errno(status);
> - goto read_super_error;
> + goto out_super;
> }
>
> ocfs2_set_ro_flag(osb, 1);
> @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> }
>
> status = ocfs2_verify_heartbeat(osb);
> - if (status < 0) {
> - mlog_errno(status);
> - goto read_super_error;
> - }
> + if (status < 0)
> + goto out_super;
>
> osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
> ocfs2_debugfs_root);
> @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>
> status = ocfs2_mount_volume(sb);
> if (status < 0)
> - goto read_super_error;
> + goto out_debugfs;
>
> if (osb->root_inode)
> inode = igrab(osb->root_inode);
>
> if (!inode) {
> status = -EIO;
> - mlog_errno(status);
> - goto read_super_error;
> + goto out_journal;
> }
>
> osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> if (!osb->osb_dev_kset) {
> status = -ENOMEM;
> mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> - goto read_super_error;
> + goto out_journal;
> }
>
> /* Create filecheck sysfs related directories/files at
> @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> status = -ENOMEM;
> mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
> "/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> - goto read_super_error;
> + goto out_journal;
> }
>
> root = d_make_root(inode);
> if (!root) {
> status = -ENOMEM;
> - mlog_errno(status);
> - goto read_super_error;
> + goto out_journal;
> }
>
> sb->s_root = root;
> @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>
> return status;
>
> -read_super_error:
> - brelse(bh);
> +out_journal:
rename it to out_dismount?
> + mlog_errno(status);
> + atomic_set(&osb->vol_state, VOLUME_DISABLED);
> + wake_up(&osb->osb_mount_event);
> + ocfs2_dismount_volume(sb, 1);
>
> - if (status)
> - mlog_errno(status);
> + return status;
Or remove mlog_errno() before and then just goto out?
>
> - if (osb) {
> - atomic_set(&osb->vol_state, VOLUME_DISABLED);
> - wake_up(&osb->osb_mount_event);
> - ocfs2_dismount_volume(sb, 1);
> - }
> +out_debugfs:
> + debugfs_remove_recursive(osb->osb_debug_root);
> +out_super:
> + ocfs2_release_system_inodes(osb);
> + kfree(osb->recovery_map);
> + ocfs2_delete_osb(osb);
> + kfree(osb);
Seems we have to take care all resources in ocfs2_initialize_super()?
> +out:
> + mlog_errno(status);
>
As I described in your previous mail, ocfs2_sb_probe() may return error
with non-NULL bh, so we should brelse here.
Thanks,
Joseph
> return status;
> }
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
2022-04-09 13:11 ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:28 ` heming.zhao--- via Ocfs2-devel
2022-04-10 12:00 ` Joseph Qi via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:28 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/9/22 21:11, Joseph Qi wrote:
> Suggest rename the subject, something like
> ocfs2: fix mount crash if journal is not alloced
OK.
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>> journal init later than before, it makes NULL pointer access in free
>> routine.
>>
>> Crash flow:
>>
>> ocfs2_fill_super
>> + ocfs2_mount_volume
>> | + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>> | + ...
>> | + ocfs2_check_volume //no chance to init osb->journal
>> |
>> + ...
>> + ocfs2_dismount_volume
>> ocfs2_release_system_inodes
>> ...
>> evict
>> ...
>> ocfs2_clear_inode
>> ocfs2_checkpoint_inode
>> ocfs2_ci_fully_checkpointed
>> time_after(journal->j_trans_id, ci->ci_last_trans)
>> + journal is empty, crash!
>>
>> For fixing, there are three solutions:
>>
>> 1> Revert commit da5e7c87827e8
>>
>> For avoiding kernel crash, this make sense for us. We only concerned
>> whether there has any non-system inode access before dlm init. The
>> answer is NO. And all journal replay/recovery handling happen after
>> dlm & journal init done. So this method is not graceful but workable.
>>
>> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>>
>> The fix code is special for mounting phase, but it will continue
>> working after mounting stage. In another word, this method adds useless
>> code in normal inode free flow.
>>
>> 3> Do directly free inode in mounting phase
>>
>> This method is brutal/complex and may introduce unsafe code, currently
>> maintainer didn't like.
>>
>> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
>> We reverted journal init code, and kept cleanup code flow.
>>
>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> fs/ocfs2/inode.c | 4 ++--
>> fs/ocfs2/journal.c | 21 +--------------------
>> fs/ocfs2/super.c | 33 +++++++++++++++++++++++++++++++++
>> 3 files changed, 36 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>> index 5739dc301569..bb116c39b581 100644
>> --- a/fs/ocfs2/inode.c
>> +++ b/fs/ocfs2/inode.c
>> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>> struct inode *inode = NULL;
>> struct super_block *sb = osb->sb;
>> struct ocfs2_find_inode_args args;
>> + journal_t *journal = osb->journal->j_journal;
>>
>> trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>> sysfile_type);
>> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>> * part of the transaction - the inode could have been reclaimed and
>> * now it is reread from disk.
>> */
>> - if (osb->journal) {
>> + if (journal) {
>> transaction_t *transaction;
>> tid_t tid;
>> struct ocfs2_inode_info *oi = OCFS2_I(inode);
>> - journal_t *journal = osb->journal->j_journal;
>>
>> read_lock(&journal->j_state_lock);
>> if (journal->j_running_transaction)
>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>> index 1887a2708709..afb85de3bb60 100644
>> --- a/fs/ocfs2/journal.c
>> +++ b/fs/ocfs2/journal.c
>> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>> int status = -1;
>> struct inode *inode = NULL; /* the journal inode */
>> journal_t *j_journal = NULL;
>> - struct ocfs2_journal *journal = NULL;
>> + struct ocfs2_journal *journal = osb->journal;
>> struct ocfs2_dinode *di = NULL;
>> struct buffer_head *bh = NULL;
>> int inode_lock = 0;
>>
>> - /* initialize our journal structure */
>> - journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>> - if (!journal) {
>> - mlog(ML_ERROR, "unable to alloc journal\n");
>> - status = -ENOMEM;
>> - goto done;
>> - }
>> - osb->journal = journal;
>> - journal->j_osb = osb;
>> -
>> - atomic_set(&journal->j_num_trans, 0);
>> - init_rwsem(&journal->j_trans_barrier);
>> - init_waitqueue_head(&journal->j_checkpointed);
>> - spin_lock_init(&journal->j_lock);
>> - journal->j_trans_id = 1UL;
>> - INIT_LIST_HEAD(&journal->j_la_cleanups);
>> - INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>> - journal->j_state = OCFS2_JOURNAL_FREE;
>> -
>> /* already have the inode for our journal */
>> inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>> osb->slot_num);
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 477cdf94122e..5f63a2333e52 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>> int i, cbits, bbits;
>> struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>> struct inode *inode = NULL;
>> + struct ocfs2_journal *journal;
>> struct ocfs2_super *osb;
>> u64 total_blocks;
>>
>> @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>
>> get_random_bytes(&osb->s_next_generation, sizeof(u32));
>>
>> + /* FIXME
>> + * This should be done in ocfs2_journal_init(), but unknown
>> + * ordering issues will cause the filesystem to crash.
>> + * If anyone wants to figure out what part of the code
>> + * refers to osb->journal before ocfs2_journal_init() is run,
>> + * be my guest.
>> + */
>> + /* initialize our journal structure */
>> + journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>> + if (!journal) {
>> + mlog(ML_ERROR, "unable to alloc journal\n");
>> + status = -ENOMEM;
>> + goto bail;
>> + }
>> + osb->journal = journal;
>> + journal->j_osb = osb;
>> +
>> + atomic_set(&journal->j_num_trans, 0);
>> + init_rwsem(&journal->j_trans_barrier);
>> + init_waitqueue_head(&journal->j_checkpointed);
>> + spin_lock_init(&journal->j_lock);
>> + journal->j_trans_id = 1UL;
>> + INIT_LIST_HEAD(&journal->j_la_cleanups);
>> + INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>> + journal->j_state = OCFS2_JOURNAL_FREE;
>> +
> We may fold these into a new function, such as ocfs2_journal_alloc().
>
OK, will create a new function in journal.c
For this area FIXME comment, the legacy comment humor & vague. In my view, we
already got the order issue clearly, why not we change it as below?
/*
* FIXME
* This should be done in ocfs2_journal_init(), but any inode
* writes back operation will cause the filesystem to crash.
*/
Thanks,
Heming
>
>> INIT_WORK(&osb->dquot_drop_work, ocfs2_drop_dquot_refs);
>> init_llist_head(&osb->dquot_drop_list);
>>
>> @@ -2483,6 +2510,12 @@ static void ocfs2_delete_osb(struct ocfs2_super *osb)
>>
>> kfree(osb->osb_orphan_wipes);
>> kfree(osb->slot_recovery_generations);
>> + /* FIXME
>> + * This belongs in journal shutdown, but because we have to
>> + * allocate osb->journal at the middle of ocfs2_initialize_super(),
>> + * we free it here.
>> + */
>> + kfree(osb->journal);
>> kfree(osb->local_alloc_copy);
>> kfree(osb->uuid_str);
>> kfree(osb->vol_label);
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init
2022-04-09 13:14 ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:29 ` heming.zhao--- via Ocfs2-devel
0 siblings, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:29 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/9/22 21:14, Joseph Qi wrote:
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> For rewrite error handling of ocfs2_initialize_super, change this
>> function return type from int to void.
>>
> Since ocfs2_resmap_init() always return 0, change it to void.
OK.
>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> fs/ocfs2/reservations.c | 4 +---
>> fs/ocfs2/reservations.h | 5 ++---
>> fs/ocfs2/super.c | 6 +-----
>> 3 files changed, 4 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/ocfs2/reservations.c b/fs/ocfs2/reservations.c
>> index 769e466887b0..a9d1296d736d 100644
>> --- a/fs/ocfs2/reservations.c
>> +++ b/fs/ocfs2/reservations.c
>> @@ -198,7 +198,7 @@ void ocfs2_resv_set_type(struct ocfs2_alloc_reservation *resv,
>> resv->r_flags |= flags;
>> }
>>
>> -int ocfs2_resmap_init(struct ocfs2_super *osb,
>> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>> struct ocfs2_reservation_map *resmap)
>> {
>> memset(resmap, 0, sizeof(*resmap));
>> @@ -207,8 +207,6 @@ int ocfs2_resmap_init(struct ocfs2_super *osb,
>> resmap->m_reservations = RB_ROOT;
>> /* m_bitmap_len is initialized to zero by the above memset. */
>> INIT_LIST_HEAD(&resmap->m_lru);
>> -
>> - return 0;
>> }
>>
>> static void ocfs2_resv_mark_lru(struct ocfs2_reservation_map *resmap,
>> diff --git a/fs/ocfs2/reservations.h b/fs/ocfs2/reservations.h
>> index 677c50663595..532d42eb6db7 100644
>> --- a/fs/ocfs2/reservations.h
>> +++ b/fs/ocfs2/reservations.h
>> @@ -78,10 +78,9 @@ void ocfs2_resv_discard(struct ocfs2_reservation_map *resmap,
>> * @ops: unused for now
>> * @max_bitmap_bytes: Maximum size of the bitmap (typically blocksize)
>> *
>> - * Only possible return value other than '0' is -ENOMEM for failure to
>> - * allocation mirror bitmap.
>> + * This function does initialize resmap job.
>> */
> This comments is meaningless, so just remove it looks better.
OK.
- Heming
>
>> -int ocfs2_resmap_init(struct ocfs2_super *osb,
>> +void ocfs2_resmap_init(struct ocfs2_super *osb,
>> struct ocfs2_reservation_map *resmap);
>>
>> /**
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 8391592817f8..f91c5510bc7e 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2111,11 +2111,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>
>> init_waitqueue_head(&osb->osb_mount_event);
>>
>> - status = ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>> - if (status) {
>> - mlog_errno(status);
>> - goto bail;
>> - }
>> + ocfs2_resmap_init(osb, &osb->osb_la_resmap);
>>
>> osb->vol_label = kmalloc(OCFS2_MAX_VOL_LABEL_LEN, GFP_KERNEL);
>> if (!osb->vol_label) {
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
2022-04-09 13:30 ` Joseph Qi via Ocfs2-devel
@ 2022-04-09 16:47 ` heming.zhao--- via Ocfs2-devel
2022-04-11 1:56 ` Joseph Qi via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-09 16:47 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/9/22 21:30, Joseph Qi wrote:
>
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After this patch, when error, ocfs2_fill_super doesn't take care to
>> release resources which are allocated in ocfs2_initialize_super.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>> 1 file changed, 40 insertions(+), 18 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index f91c5510bc7e..8443ba031dec 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>> if (!osb) {
>> status = -ENOMEM;
>> mlog_errno(status);
>> - goto bail;
>> + goto out;
>> }
>> ... ...
>>
>> -bail:
>> + return status;
>> +
>> +out_slot_info:
>> + ocfs2_free_slot_info(osb);
>> +out_system_inodes:
>> + ocfs2_release_system_inodes(osb);
>> +out_dlm_out:
>> + ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>> +out_uuid_str:
>> + kfree(osb->uuid_str);
>> +out_journal:
>> + kfree(osb->journal);
>> +out_orphan_wipes:
>> + kfree(osb->osb_orphan_wipes);
>> +out_slot_recovery_gen:
>> + kfree(osb->slot_recovery_generations);
>> +out_vol_label:
>> + kfree(osb->vol_label);
>> +out_recovery_map:
>> + kfree(osb->recovery_map);
>> +out:
>> + kfree(osb);
>
> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
>
Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
to return.
Thanks,
Heming
>
>> return status;
>> }
>>
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-09 13:46 ` Joseph Qi via Ocfs2-devel
@ 2022-04-10 3:58 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:22 ` Heming Zhao via Ocfs2-devel
1 sibling, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-10 3:58 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/9/22 21:46, Joseph Qi wrote:
>
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> After this patch, when error, ocfs2_fill_super doesn't take care to
>> release resources which are allocated in ocfs2_mount_volume.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
>> 1 file changed, 20 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 8443ba031dec..d4b7a29cb364 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>> static int ocfs2_mount_volume(struct super_block *sb)
>> {
>> int status = 0;
>> - int unlock_super = 0;
>> struct ocfs2_super *osb = OCFS2_SB(sb);
>>
>> if (ocfs2_is_hard_readonly(osb))
>> - goto leave;
>> + goto out;
>>
>> mutex_init(&osb->obs_trim_fs_mutex);
>>
>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
>> if (status == -EBADR && ocfs2_userspace_stack(osb))
>> mlog(ML_ERROR, "couldn't mount because cluster name on"
>> " disk does not match the running cluster name.\n");
>> - goto leave;
>> + goto out;
>> }
>>
>> status = ocfs2_super_lock(osb, 1);
>> if (status < 0) {
>> mlog_errno(status);
>> - goto leave;
>> + goto out_dlm;
>> }
>> - unlock_super = 1;
>>
>> /* This will load up the node map and add ourselves to it. */
>> status = ocfs2_find_slot(osb);
>> if (status < 0) {
>> mlog_errno(status);
>> - goto leave;
>> + goto out_super_lock;
>> }
>>
>> /* load all node-local system inodes */
>> status = ocfs2_init_local_system_inodes(osb);
>> if (status < 0) {
>> mlog_errno(status);
>> - goto leave;
>> + goto out_super_lock;
>> }
>>
>> status = ocfs2_check_volume(osb);
>> if (status < 0) {
>> mlog_errno(status);
>> - goto leave;
>> + goto out_system_inodes;
>> }
>>
>> status = ocfs2_truncate_log_init(osb);
>> - if (status < 0)
>> + if (status < 0) {
>> mlog_errno(status);
>> + goto out_journal_shutdown;
>
> ocfs2_check_volume() not only does journal initialization,
> but also local alloc.
I agree and already handled this case. please read following explanation.
call flow:
ocfs2_fill_super
ocfs2_mount_volume
+ ocfs2_init_local_system_inodes //[1]
+ ocfs2_check_volume
ocfs2_load_local_alloc //may fail
If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
And other alloced inodes (from [1]) still exist.
also note, the sys inodes release function is ocfs2_release_system_inodes(),
which could handle both osb->global_system_inodes[] & osb->local_system_inodes
[]
are NULL pointer case.
the cleanup job will be done in both places:
- ocfs2_mount_volume (patch 4/5)
- ocfs2_fill_super (patch 5/5)
related code with patch 4/5:
```ocfs2_mount_volume()
... ...
status = ocfs2_check_volume(osb);
if (status < 0) {
mlog_errno(status);
//need to release global/local sys inodes (by ocfs2_release_system_inodes())
// which alloced by above [1].
goto out_system_inodes;
}
status = ocfs2_truncate_log_init(osb);
if (status < 0) {
mlog_errno(status);
//need to release journal: by ocfs2_journal_shutdown()
//need to release global/local sys inodes: by ocfs2_release_system_inodes()
goto out_journal_shutdown;
}
ocfs2_super_unlock(osb, 1);
return 0;
out_journal_shutdown:
ocfs2_journal_shutdown(osb); //through to label "out_system_inodes".
out_system_inodes:
ocfs2_release_system_inodes(osb);
out_super_lock:
ocfs2_super_unlock(osb, 1);
out_dlm:
ocfs2_dlm_shutdown(osb, 0);
out:
return status;
}
```
>
>> + }
>>
>> -leave:
>> - if (unlock_super)
>> - ocfs2_super_unlock(osb, 1);
>> + ocfs2_super_unlock(osb, 1);
>> + return status;
> Explicitly return 0 may be better, which means this is the
> normal path.
OK
Thanks,
Heming
>
>>
>> +out_journal_shutdown:
>> + ocfs2_journal_shutdown(osb);
>> +out_system_inodes:
>> + ocfs2_release_system_inodes(osb);
>> +out_super_lock:
>> + ocfs2_super_unlock(osb, 1);
>> +out_dlm:
>> + ocfs2_dlm_shutdown(osb, 0);
>> +out:
>> return status;
>> }
>>
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
2022-04-09 14:02 ` Joseph Qi via Ocfs2-devel
@ 2022-04-10 10:25 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:49 ` Heming Zhao via Ocfs2-devel
1 sibling, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-10 10:25 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/9/22 22:02, Joseph Qi wrote:
>
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
>> current ocfs2_fill_super error handling is very simple, and includes
>> kernel crash possibility.
>>
>> e.g.
>>
>> ocfs2_fill_super
>> ocfs2_initialize_super // fails in check max_slots & return
>>
>> Then osb->osb_mount_event is not properly initialized, and then will
>> crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
>>
> With patch 3, this case won't exist any more.
Agree. After patch 3, ocfs2_initialize_super() will free osb before returning
failure, and this issue won't be triggered.
I will change it in next version.
>
>> For fixing the issue and enhancing error handling, we rewrite error
>> handling of ocfs2_fill_super.
>>
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
>> 1 file changed, 33 insertions(+), 34 deletions(-)
>>
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index d4b7a29cb364..c3efd9ad49ac 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>
>> if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
>> status = -EINVAL;
>> - goto read_super_error;
>> + goto out;
>> }
>>
>> /* probe for superblock */
>> status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
>> if (status < 0) {
>> mlog(ML_ERROR, "superblock probe failed!\n");
>> - goto read_super_error;
>> + goto out;
>> }
>>
>> status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
>> - osb = OCFS2_SB(sb);
>> - if (status < 0) {
>> - mlog_errno(status);
>> - goto read_super_error;
>> - }
>> brelse(bh);
>> bh = NULL;
>> + if (status < 0)
>> + goto out;
>> +
>> + osb = OCFS2_SB(sb);
>>
>> if (!ocfs2_check_set_options(sb, &parsed_options)) {
>> status = -EINVAL;
>> - goto read_super_error;
>> + goto out_super;
>> }
>> osb->s_mount_opt = parsed_options.mount_opt;
>> osb->s_atime_quantum = parsed_options.atime_quantum;
>> @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>
>> status = ocfs2_verify_userspace_stack(osb, &parsed_options);
>> if (status)
>> - goto read_super_error;
>> + goto out_super;
>>
>> sb->s_magic = OCFS2_SUPER_MAGIC;
>>
>> @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> status = -EACCES;
>> mlog(ML_ERROR, "Readonly device detected but readonly "
>> "mount was not specified.\n");
>> - goto read_super_error;
>> + goto out_super;
>> }
>>
>> /* You should not be able to start a local heartbeat
>> @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> status = -EROFS;
>> mlog(ML_ERROR, "Local heartbeat specified on readonly "
>> "device.\n");
>> - goto read_super_error;
>> + goto out_super;
>> }
>>
>> status = ocfs2_check_journals_nolocks(osb);
>> @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> mlog(ML_ERROR, "Recovery required on readonly "
>> "file system, but write access is "
>> "unavailable.\n");
>> - else
>> - mlog_errno(status);
>> - goto read_super_error;
>> + goto out_super;
>> }
>>
>> ocfs2_set_ro_flag(osb, 1);
>> @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> }
>>
>> status = ocfs2_verify_heartbeat(osb);
>> - if (status < 0) {
>> - mlog_errno(status);
>> - goto read_super_error;
>> - }
>> + if (status < 0)
>> + goto out_super;
>>
>> osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
>> ocfs2_debugfs_root);
>> @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>
>> status = ocfs2_mount_volume(sb);
>> if (status < 0)
>> - goto read_super_error;
>> + goto out_debugfs;
>>
>> if (osb->root_inode)
>> inode = igrab(osb->root_inode);
>>
>> if (!inode) {
>> status = -EIO;
>> - mlog_errno(status);
>> - goto read_super_error;
>> + goto out_journal;
>> }
>>
>> osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
>> @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> if (!osb->osb_dev_kset) {
>> status = -ENOMEM;
>> mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
>> - goto read_super_error;
>> + goto out_journal;
>> }
>>
>> /* Create filecheck sysfs related directories/files at
>> @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>> status = -ENOMEM;
>> mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
>> "/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
>> - goto read_super_error;
>> + goto out_journal;
>> }
>>
>> root = d_make_root(inode);
>> if (!root) {
>> status = -ENOMEM;
>> - mlog_errno(status);
>> - goto read_super_error;
>> + goto out_journal;
>> }
>>
>> sb->s_root = root;
>> @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
>>
>> return status;
>>
>> -read_super_error:
>> - brelse(bh);
>> +out_journal:
>
> rename it to out_dismount?
agree.
>
>> + mlog_errno(status);
>> + atomic_set(&osb->vol_state, VOLUME_DISABLED);
>> + wake_up(&osb->osb_mount_event);
>> + ocfs2_dismount_volume(sb, 1);
>>
>> - if (status)
>> - mlog_errno(status);
>> + return status;
>
> Or remove mlog_errno() before and then just goto out?
your mean: at the end of lable "out_journal", call "goto out"?
if yes, no problem for me.
>
>>
>> - if (osb) {
>> - atomic_set(&osb->vol_state, VOLUME_DISABLED);
>> - wake_up(&osb->osb_mount_event);
>> - ocfs2_dismount_volume(sb, 1);
>> - }
>> +out_debugfs:
>> + debugfs_remove_recursive(osb->osb_debug_root);
>> +out_super:
>> + ocfs2_release_system_inodes(osb);
>> + kfree(osb->recovery_map);
>> + ocfs2_delete_osb(osb);
>> + kfree(osb);
>
> Seems we have to take care all resources in ocfs2_initialize_super()?
Yes, I checked every lines.
for ocfs2_initialize_super(), I lists all my found resources:
(list in time order):
osb, osb->recovery_map, osb->vol_label, osb->slot_recovery_generations,
osb->osb_orphan_wipes, osb->journal, osb->uuid_str
- all above have itselves out_xx label.
osb->osb_dlm_debug & dlm sysfs stuffs.
- release by out_dlm_out
system inodes (by ocfs2_init_global_system_inodes)
- release by out_system_inodes
osb->slot_info, osb->slot_info->si_bh
- release by out_slot_info
workqueue:osb->ocfs2_wq
- will be freed in ocfs2_fill_super label "out_super": ocfs2_delete_osb().
or be freed in ocfs2_dismount_volume => ocfs2_delete_osb().
>
>> +out:
>> + mlog_errno(status);
>>
> As I described in your previous mail, ocfs2_sb_probe() may return error
> with non-NULL bh, so we should brelse here.
>
I also explained why I dropped the brelse(bh), maybe anti-spam system ate my mail.
(I found my emails also disappeared in other mailling lists.)
I re-explain here. below code based on patch 5/5.
please note <2> & <4>.
```
static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
{
... ...
if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
status = -EINVAL;
goto out;
}
//hm: <1> bh is alloced/created from here.
/* probe for superblock */
status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
if (status < 0) {
mlog(ML_ERROR, "superblock probe failed!\n");
//hm: <2> if error, ocfs2_sb_probe() will release bh. So "out" label
//doesn't need to call brelse(bh).
goto out;
}
status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
brelse(bh); //hm: <3> release bh before assessment status.
bh = NULL;
if (status < 0)
goto out; //hm: <4> if error, bh is already released before "if()",
// So "out" label does't need to call brelse(bh)
osb = OCFS2_SB(sb);
... ...
out:
mlog_errno(status);
return status;
}
```
Thanks,
Heming
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue
2022-04-09 16:28 ` heming.zhao--- via Ocfs2-devel
@ 2022-04-10 12:00 ` Joseph Qi via Ocfs2-devel
0 siblings, 0 replies; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-10 12:00 UTC (permalink / raw)
To: heming.zhao, ocfs2-devel
On 4/10/22 12:28 AM, heming.zhao@suse.com wrote:
> On 4/9/22 21:11, Joseph Qi wrote:
>> Suggest rename the subject, something like
>> ocfs2: fix mount crash if journal is not alloced
>
> OK.
>
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>>> journal init later than before, it makes NULL pointer access in free
>>> routine.
>>>
>>> Crash flow:
>>>
>>> ocfs2_fill_super
>>> + ocfs2_mount_volume
>>> | + ocfs2_dlm_init //fail & return, osb->journal is NULL.
>>> | + ...
>>> | + ocfs2_check_volume //no chance to init osb->journal
>>> |
>>> + ...
>>> + ocfs2_dismount_volume
>>> ocfs2_release_system_inodes
>>> ...
>>> evict
>>> ...
>>> ocfs2_clear_inode
>>> ocfs2_checkpoint_inode
>>> ocfs2_ci_fully_checkpointed
>>> time_after(journal->j_trans_id, ci->ci_last_trans)
>>> + journal is empty, crash!
>>>
>>> For fixing, there are three solutions:
>>>
>>> 1> Revert commit da5e7c87827e8
>>>
>>> For avoiding kernel crash, this make sense for us. We only concerned
>>> whether there has any non-system inode access before dlm init. The
>>> answer is NO. And all journal replay/recovery handling happen after
>>> dlm & journal init done. So this method is not graceful but workable.
>>>
>>> 2> Add osb->journal check in free inode routine (eg ocfs2_clear_inode)
>>>
>>> The fix code is special for mounting phase, but it will continue
>>> working after mounting stage. In another word, this method adds useless
>>> code in normal inode free flow.
>>>
>>> 3> Do directly free inode in mounting phase
>>>
>>> This method is brutal/complex and may introduce unsafe code, currently
>>> maintainer didn't like.
>>>
>>> At last, we chose method <1> and partly reverted commit da5e7c87827e8.
>>> We reverted journal init code, and kept cleanup code flow.
>>>
>>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>> fs/ocfs2/inode.c | 4 ++--
>>> fs/ocfs2/journal.c | 21 +--------------------
>>> fs/ocfs2/super.c | 33 +++++++++++++++++++++++++++++++++
>>> 3 files changed, 36 insertions(+), 22 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>>> index 5739dc301569..bb116c39b581 100644
>>> --- a/fs/ocfs2/inode.c
>>> +++ b/fs/ocfs2/inode.c
>>> @@ -125,6 +125,7 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>> struct inode *inode = NULL;
>>> struct super_block *sb = osb->sb;
>>> struct ocfs2_find_inode_args args;
>>> + journal_t *journal = osb->journal->j_journal;
>>> trace_ocfs2_iget_begin((unsigned long long)blkno, flags,
>>> sysfile_type);
>>> @@ -171,11 +172,10 @@ struct inode *ocfs2_iget(struct ocfs2_super *osb, u64 blkno, unsigned flags,
>>> * part of the transaction - the inode could have been reclaimed and
>>> * now it is reread from disk.
>>> */
>>> - if (osb->journal) {
>>> + if (journal) {
>>> transaction_t *transaction;
>>> tid_t tid;
>>> struct ocfs2_inode_info *oi = OCFS2_I(inode);
>>> - journal_t *journal = osb->journal->j_journal;
>>> read_lock(&journal->j_state_lock);
>>> if (journal->j_running_transaction)
>>> diff --git a/fs/ocfs2/journal.c b/fs/ocfs2/journal.c
>>> index 1887a2708709..afb85de3bb60 100644
>>> --- a/fs/ocfs2/journal.c
>>> +++ b/fs/ocfs2/journal.c
>>> @@ -815,30 +815,11 @@ int ocfs2_journal_init(struct ocfs2_super *osb, int *dirty)
>>> int status = -1;
>>> struct inode *inode = NULL; /* the journal inode */
>>> journal_t *j_journal = NULL;
>>> - struct ocfs2_journal *journal = NULL;
>>> + struct ocfs2_journal *journal = osb->journal;
>>> struct ocfs2_dinode *di = NULL;
>>> struct buffer_head *bh = NULL;
>>> int inode_lock = 0;
>>> - /* initialize our journal structure */
>>> - journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>>> - if (!journal) {
>>> - mlog(ML_ERROR, "unable to alloc journal\n");
>>> - status = -ENOMEM;
>>> - goto done;
>>> - }
>>> - osb->journal = journal;
>>> - journal->j_osb = osb;
>>> -
>>> - atomic_set(&journal->j_num_trans, 0);
>>> - init_rwsem(&journal->j_trans_barrier);
>>> - init_waitqueue_head(&journal->j_checkpointed);
>>> - spin_lock_init(&journal->j_lock);
>>> - journal->j_trans_id = 1UL;
>>> - INIT_LIST_HEAD(&journal->j_la_cleanups);
>>> - INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>>> - journal->j_state = OCFS2_JOURNAL_FREE;
>>> -
>>> /* already have the inode for our journal */
>>> inode = ocfs2_get_system_file_inode(osb, JOURNAL_SYSTEM_INODE,
>>> osb->slot_num);
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index 477cdf94122e..5f63a2333e52 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -2015,6 +2015,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>> int i, cbits, bbits;
>>> struct ocfs2_dinode *di = (struct ocfs2_dinode *)bh->b_data;
>>> struct inode *inode = NULL;
>>> + struct ocfs2_journal *journal;
>>> struct ocfs2_super *osb;
>>> u64 total_blocks;
>>> @@ -2195,6 +2196,32 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>> get_random_bytes(&osb->s_next_generation, sizeof(u32));
>>> + /* FIXME
>>> + * This should be done in ocfs2_journal_init(), but unknown
>>> + * ordering issues will cause the filesystem to crash.
>>> + * If anyone wants to figure out what part of the code
>>> + * refers to osb->journal before ocfs2_journal_init() is run,
>>> + * be my guest.
>>> + */
>>> + /* initialize our journal structure */
>>> + journal = kzalloc(sizeof(struct ocfs2_journal), GFP_KERNEL);
>>> + if (!journal) {
>>> + mlog(ML_ERROR, "unable to alloc journal\n");
>>> + status = -ENOMEM;
>>> + goto bail;
>>> + }
>>> + osb->journal = journal;
>>> + journal->j_osb = osb;
>>> +
>>> + atomic_set(&journal->j_num_trans, 0);
>>> + init_rwsem(&journal->j_trans_barrier);
>>> + init_waitqueue_head(&journal->j_checkpointed);
>>> + spin_lock_init(&journal->j_lock);
>>> + journal->j_trans_id = 1UL;
>>> + INIT_LIST_HEAD(&journal->j_la_cleanups);
>>> + INIT_WORK(&journal->j_recovery_work, ocfs2_complete_recovery);
>>> + journal->j_state = OCFS2_JOURNAL_FREE;
>>> +
>> We may fold these into a new function, such as ocfs2_journal_alloc().
>>
>
> OK, will create a new function in journal.c
> For this area FIXME comment, the legacy comment humor & vague. In my view, we
> already got the order issue clearly, why not we change it as below?
> /*
> * FIXME
> * This should be done in ocfs2_journal_init(), but any inode
> * writes back operation will cause the filesystem to crash.
> */
Sounds good.
Thanks,
Joseph
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
2022-04-09 16:47 ` heming.zhao--- via Ocfs2-devel
@ 2022-04-11 1:56 ` Joseph Qi via Ocfs2-devel
2022-04-11 2:09 ` heming.zhao--- via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-11 1:56 UTC (permalink / raw)
To: heming.zhao, ocfs2-devel
On 4/10/22 12:47 AM, heming.zhao@suse.com wrote:
> On 4/9/22 21:30, Joseph Qi wrote:
>>
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>> release resources which are allocated in ocfs2_initialize_super.
>>>
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>> fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>>> 1 file changed, 40 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index f91c5510bc7e..8443ba031dec 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>> if (!osb) {
>>> status = -ENOMEM;
>>> mlog_errno(status);
>>> - goto bail;
>>> + goto out;
>>> }
>>> ... ...
>>> -bail:
>>> + return status;
>>> +
>>> +out_slot_info:
>>> + ocfs2_free_slot_info(osb);
>>> +out_system_inodes:
>>> + ocfs2_release_system_inodes(osb);
>>> +out_dlm_out:
>>> + ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>>> +out_uuid_str:
>>> + kfree(osb->uuid_str);
>>> +out_journal:
>>> + kfree(osb->journal);
>>> +out_orphan_wipes:
>>> + kfree(osb->osb_orphan_wipes);
>>> +out_slot_recovery_gen:
>>> + kfree(osb->slot_recovery_generations);
>>> +out_vol_label:
>>> + kfree(osb->vol_label);
>>> +out_recovery_map:
>>> + kfree(osb->recovery_map);
>>> +out:
>>> + kfree(osb);
>>
>> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
>>
>
> Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
> triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
> to return.
>
Right, but we'd better make this patch look more complete.
Thanks,
Joseph
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error
2022-04-11 1:56 ` Joseph Qi via Ocfs2-devel
@ 2022-04-11 2:09 ` heming.zhao--- via Ocfs2-devel
0 siblings, 0 replies; 23+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-04-11 2:09 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On 4/11/22 09:56, Joseph Qi wrote:
>
>
> On 4/10/22 12:47 AM, heming.zhao@suse.com wrote:
>> On 4/9/22 21:30, Joseph Qi wrote:
>>>
>>>
>>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>>> release resources which are allocated in ocfs2_initialize_super.
>>>>
>>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>>> ---
>>>> fs/ocfs2/super.c | 58 +++++++++++++++++++++++++++++++++---------------
>>>> 1 file changed, 40 insertions(+), 18 deletions(-)
>>>>
>>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>>> index f91c5510bc7e..8443ba031dec 100644
>>>> --- a/fs/ocfs2/super.c
>>>> +++ b/fs/ocfs2/super.c
>>>> @@ -2023,7 +2023,7 @@ static int ocfs2_initialize_super(struct super_block *sb,
>>>> if (!osb) {
>>>> status = -ENOMEM;
>>>> mlog_errno(status);
>>>> - goto bail;
>>>> + goto out;
>>>> }
>>>> ... ...
>>>> -bail:
>>>> + return status;
>>>> +
>>>> +out_slot_info:
>>>> + ocfs2_free_slot_info(osb);
>>>> +out_system_inodes:
>>>> + ocfs2_release_system_inodes(osb);
>>>> +out_dlm_out:
>>>> + ocfs2_put_dlm_debug(osb->osb_dlm_debug);
>>>> +out_uuid_str:
>>>> + kfree(osb->uuid_str);
>>>> +out_journal:
>>>> + kfree(osb->journal);
>>>> +out_orphan_wipes:
>>>> + kfree(osb->osb_orphan_wipes);
>>>> +out_slot_recovery_gen:
>>>> + kfree(osb->slot_recovery_generations);
>>>> +out_vol_label:
>>>> + kfree(osb->vol_label);
>>>> +out_recovery_map:
>>>> + kfree(osb->recovery_map);
>>>> +out:
>>>> + kfree(osb);
>>>
>>> Should set osb to NULL here to prevent UAF in ocfs2_fill_super().
>>>
>>
>> Your concern only valid with patch 1/5+2/5+3/5, but after 5/5, the UAF won't be
>> triggered. ocfs2_initialize_super() failure will directly jump (by "goto out")
>> to return.
>>
>
> Right, but we'd better make this patch look more complete.
>
OK, I will make a complete patch on next verion.
And I found Thunderbird automatically adds '\n' in reply mail, which make the mail
format messy. I am sorry for this, and try to switch from Thunderbird to mutt.
Thanks,
Heming
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-09 13:46 ` Joseph Qi via Ocfs2-devel
2022-04-10 3:58 ` heming.zhao--- via Ocfs2-devel
@ 2022-04-12 8:22 ` Heming Zhao via Ocfs2-devel
2022-04-12 11:54 ` Joseph Qi via Ocfs2-devel
1 sibling, 1 reply; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12 8:22 UTC (permalink / raw)
To: Joseph Qi; +Cc: ocfs2-devel
My last reply (on 4.10) had messy format, because thunderbird automatically
add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
Please check my comments in below.
On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
>
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
> > After this patch, when error, ocfs2_fill_super doesn't take care to
> > release resources which are allocated in ocfs2_mount_volume.
> >
> > Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> > ---
> > fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
> > 1 file changed, 20 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index 8443ba031dec..d4b7a29cb364 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
> > static int ocfs2_mount_volume(struct super_block *sb)
> > {
> > int status = 0;
> > - int unlock_super = 0;
> > struct ocfs2_super *osb = OCFS2_SB(sb);
> >
> > if (ocfs2_is_hard_readonly(osb))
> > - goto leave;
> > + goto out;
> >
> > mutex_init(&osb->obs_trim_fs_mutex);
> >
> > @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
> > if (status == -EBADR && ocfs2_userspace_stack(osb))
> > mlog(ML_ERROR, "couldn't mount because cluster name on"
> > " disk does not match the running cluster name.\n");
> > - goto leave;
> > + goto out;
> > }
> >
> > status = ocfs2_super_lock(osb, 1);
> > if (status < 0) {
> > mlog_errno(status);
> > - goto leave;
> > + goto out_dlm;
> > }
> > - unlock_super = 1;
> >
> > /* This will load up the node map and add ourselves to it. */
> > status = ocfs2_find_slot(osb);
> > if (status < 0) {
> > mlog_errno(status);
> > - goto leave;
> > + goto out_super_lock;
> > }
> >
> > /* load all node-local system inodes */
> > status = ocfs2_init_local_system_inodes(osb);
> > if (status < 0) {
> > mlog_errno(status);
> > - goto leave;
> > + goto out_super_lock;
> > }
> >
> > status = ocfs2_check_volume(osb);
> > if (status < 0) {
> > mlog_errno(status);
> > - goto leave;
> > + goto out_system_inodes;
> > }
> >
> > status = ocfs2_truncate_log_init(osb);
> > - if (status < 0)
> > + if (status < 0) {
> > mlog_errno(status);
> > + goto out_journal_shutdown;
>
> ocfs2_check_volume() not only does journal initialization,
> but also local alloc.
>
I agree and already handled this case. Please read following explanation.
Call flow:
ocfs2_fill_super
ocfs2_mount_volume
+ ocfs2_init_local_system_inodes //[1]
+ ocfs2_check_volume
ocfs2_load_local_alloc //may fail
If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
And other alloced inodes (from [1]) still exist.
Also note, the sys inodes release function is ocfs2_release_system_inodes(),
which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
are NULL pointer case.
The cleanup job will be done in both places:
- ocfs2_mount_volume (patch 4/5)
- ocfs2_fill_super (patch 5/5)
Related code after patched 4/5:
(please check my comment start with "hm:")
```
static int ocfs2_mount_volume(struct super_block *sb)
{
... ...
status = ocfs2_check_volume(osb);
if (status < 0) {
mlog_errno(status);
//hm:
//need to release global/local sys inodes (by //ocfs2_release_system_inodes())
//which are alloced by above [1].
goto out_system_inodes;
}
status = ocfs2_truncate_log_init(osb);
if (status < 0) {
mlog_errno(status);
//hm:
//need to release journal: by ocfs2_journal_shutdown()
//need to release global/local sys inodes: byocfs2_release_system_inodes()
goto out_journal_shutdown;
}
ocfs2_super_unlock(osb, 1);
return status; //<== hm: will change to 0 under next comment
out_journal_shutdown:
ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
out_system_inodes:
ocfs2_release_system_inodes(osb);
out_super_lock:
ocfs2_super_unlock(osb, 1);
out_dlm:
ocfs2_dlm_shutdown(osb, 0);
out:
return status;
}
```
> > + }
> >
> > -leave:
> > - if (unlock_super)
> > - ocfs2_super_unlock(osb, 1);
> > + ocfs2_super_unlock(osb, 1);
> > + return status;
> Explicitly return 0 may be better, which means this is the
> normal path.
>
OK.
- Heming
>
> >
> > +out_journal_shutdown:
> > + ocfs2_journal_shutdown(osb);
> > +out_system_inodes:
> > + ocfs2_release_system_inodes(osb);
> > +out_super_lock:
> > + ocfs2_super_unlock(osb, 1);
> > +out_dlm:
> > + ocfs2_dlm_shutdown(osb, 0);
> > +out:
> > return status;
> > }
> >
>
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super
2022-04-09 14:02 ` Joseph Qi via Ocfs2-devel
2022-04-10 10:25 ` heming.zhao--- via Ocfs2-devel
@ 2022-04-12 8:49 ` Heming Zhao via Ocfs2-devel
1 sibling, 0 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12 8:49 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
Resend my reply for easy review.
On Sat, Apr 09, 2022 at 10:02:16PM +0800, Joseph Qi wrote:
>
>
> On 4/8/22 6:30 PM, Heming Zhao wrote:
> > current ocfs2_fill_super error handling is very simple, and includes
> > kernel crash possibility.
> >
> > e.g.
> >
> > ocfs2_fill_super
> > ocfs2_initialize_super // fails in check max_slots & return
> >
> > Then osb->osb_mount_event is not properly initialized, and then will
> > crash at wake_up(&osb->osb_mount_event) in goto label read_super_error.
> >
> With patch 3, this case won't exist any more.
Agree. After patch 3, ocfs2_initialize_super() will free osb before returning
failure, and this issue won't be triggered.
I will change the commit log in next version.
>
> > For fixing the issue and enhancing error handling, we rewrite error
> > handling of ocfs2_fill_super.
> >
> > Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> > ---
> > fs/ocfs2/super.c | 67 ++++++++++++++++++++++++------------------------
> > 1 file changed, 33 insertions(+), 34 deletions(-)
> >
> > diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> > index d4b7a29cb364..c3efd9ad49ac 100644
> > --- a/fs/ocfs2/super.c
> > +++ b/fs/ocfs2/super.c
> > @@ -989,28 +989,27 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >
> > if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
> > status = -EINVAL;
> > - goto read_super_error;
> > + goto out;
> > }
> >
> > /* probe for superblock */
> > status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
> > if (status < 0) {
> > mlog(ML_ERROR, "superblock probe failed!\n");
> > - goto read_super_error;
> > + goto out;
> > }
> >
> > status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
> > - osb = OCFS2_SB(sb);
> > - if (status < 0) {
> > - mlog_errno(status);
> > - goto read_super_error;
> > - }
> > brelse(bh);
> > bh = NULL;
> > + if (status < 0)
> > + goto out;
> > +
> > + osb = OCFS2_SB(sb);
> >
> > if (!ocfs2_check_set_options(sb, &parsed_options)) {
> > status = -EINVAL;
> > - goto read_super_error;
> > + goto out_super;
> > }
> > osb->s_mount_opt = parsed_options.mount_opt;
> > osb->s_atime_quantum = parsed_options.atime_quantum;
> > @@ -1027,7 +1026,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >
> > status = ocfs2_verify_userspace_stack(osb, &parsed_options);
> > if (status)
> > - goto read_super_error;
> > + goto out_super;
> >
> > sb->s_magic = OCFS2_SUPER_MAGIC;
> >
> > @@ -1041,7 +1040,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > status = -EACCES;
> > mlog(ML_ERROR, "Readonly device detected but readonly "
> > "mount was not specified.\n");
> > - goto read_super_error;
> > + goto out_super;
> > }
> >
> > /* You should not be able to start a local heartbeat
> > @@ -1050,7 +1049,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > status = -EROFS;
> > mlog(ML_ERROR, "Local heartbeat specified on readonly "
> > "device.\n");
> > - goto read_super_error;
> > + goto out_super;
> > }
> >
> > status = ocfs2_check_journals_nolocks(osb);
> > @@ -1059,9 +1058,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > mlog(ML_ERROR, "Recovery required on readonly "
> > "file system, but write access is "
> > "unavailable.\n");
> > - else
> > - mlog_errno(status);
> > - goto read_super_error;
> > + goto out_super;
> > }
> >
> > ocfs2_set_ro_flag(osb, 1);
> > @@ -1077,10 +1074,8 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > }
> >
> > status = ocfs2_verify_heartbeat(osb);
> > - if (status < 0) {
> > - mlog_errno(status);
> > - goto read_super_error;
> > - }
> > + if (status < 0)
> > + goto out_super;
> >
> > osb->osb_debug_root = debugfs_create_dir(osb->uuid_str,
> > ocfs2_debugfs_root);
> > @@ -1094,15 +1089,14 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >
> > status = ocfs2_mount_volume(sb);
> > if (status < 0)
> > - goto read_super_error;
> > + goto out_debugfs;
> >
> > if (osb->root_inode)
> > inode = igrab(osb->root_inode);
> >
> > if (!inode) {
> > status = -EIO;
> > - mlog_errno(status);
> > - goto read_super_error;
> > + goto out_journal;
> > }
> >
> > osb->osb_dev_kset = kset_create_and_add(sb->s_id, NULL,
> > @@ -1110,7 +1104,7 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > if (!osb->osb_dev_kset) {
> > status = -ENOMEM;
> > mlog(ML_ERROR, "Unable to create device kset %s.\n", sb->s_id);
> > - goto read_super_error;
> > + goto out_journal;
> > }
> >
> > /* Create filecheck sysfs related directories/files at
> > @@ -1119,14 +1113,13 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> > status = -ENOMEM;
> > mlog(ML_ERROR, "Unable to create filecheck sysfs directory at "
> > "/sys/fs/ocfs2/%s/filecheck.\n", sb->s_id);
> > - goto read_super_error;
> > + goto out_journal;
> > }
> >
> > root = d_make_root(inode);
> > if (!root) {
> > status = -ENOMEM;
> > - mlog_errno(status);
> > - goto read_super_error;
> > + goto out_journal;
> > }
> >
> > sb->s_root = root;
> > @@ -1178,17 +1171,23 @@ static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
> >
> > return status;
> >
> > -read_super_error:
> > - brelse(bh);
> > +out_journal:
>
> rename it to out_dismount?
agree.
>
> > + mlog_errno(status);
> > + atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > + wake_up(&osb->osb_mount_event);
> > + ocfs2_dismount_volume(sb, 1);
> >
> > - if (status)
> > - mlog_errno(status);
> > + return status;
>
> Or remove mlog_errno() before and then just goto out?
Your mean: at the end of lable "out_journal", call "goto out"?
If yes, no problem for me.
>
> >
> > - if (osb) {
> > - atomic_set(&osb->vol_state, VOLUME_DISABLED);
> > - wake_up(&osb->osb_mount_event);
> > - ocfs2_dismount_volume(sb, 1);
> > - }
> > +out_debugfs:
> > + debugfs_remove_recursive(osb->osb_debug_root);
> > +out_super:
> > + ocfs2_release_system_inodes(osb);
> > + kfree(osb->recovery_map);
> > + ocfs2_delete_osb(osb);
> > + kfree(osb);
>
> Seems we have to take care all resources in ocfs2_initialize_super()?
Yes, I checked every line.
for ocfs2_initialize_super(), I lists all my found resources:
(list in time order):
osb, osb->recovery_map, osb->vol_label, osb->slot_recovery_generations,
osb->osb_orphan_wipes, osb->journal, osb->uuid_str
- all above have itselves out_xx label.
osb->osb_dlm_debug & dlm sysfs stuffs.
- release by out_dlm_out label
system inodes (by ocfs2_init_global_system_inodes)
- release by out_system_inodes label
osb->slot_info, osb->slot_info->si_bh
- release by out_slot_info label
workqueue:osb->ocfs2_wq
- will be freed in ocfs2_fill_super label "out_super": ocfs2_delete_osb().
or be freed in ocfs2_dismount_volume => ocfs2_delete_osb().
>
> > +out:
> > + mlog_errno(status);
> >
> As I described in your previous mail, ocfs2_sb_probe() may return error
> with non-NULL bh, so we should brelse here.
>
I also explained why I dropped the brelse(bh), maybe anti-spam system
ate my mail. (I found my emails also disappeared in other mailling lists.)
I re-explain here. Below code based on patch 5/5. Please note <2> & <4>.
```
static int ocfs2_fill_super(struct super_block *sb, void *data, int silent)
{
... ...
if (!ocfs2_parse_options(sb, data, &parsed_options, 0)) {
status = -EINVAL;
goto out;
}
//hm: <1> bh is alloced/created from here.
/* probe for superblock */
status = ocfs2_sb_probe(sb, &bh, §or_size, &stats);
if (status < 0) {
mlog(ML_ERROR, "superblock probe failed!\n");
//hm: <2> if error, ocfs2_sb_probe() will release bh. So "out" label
// doesn't need to call brelse(bh).
goto out;
}
status = ocfs2_initialize_super(sb, bh, sector_size, &stats);
brelse(bh); //hm: <3> release bh before assessment status.
bh = NULL;
if (status < 0)
goto out; //hm: <4> if error, bh is already released before "if()",
// So "out" label does't need to call brelse(bh)
osb = OCFS2_SB(sb);
... ...
out: //hm: no need to call brelse(bh)
mlog_errno(status);
return status;
}
```
Thanks,
Heming
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-12 8:22 ` Heming Zhao via Ocfs2-devel
@ 2022-04-12 11:54 ` Joseph Qi via Ocfs2-devel
2022-04-12 16:46 ` Heming Zhao via Ocfs2-devel
0 siblings, 1 reply; 23+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-04-12 11:54 UTC (permalink / raw)
To: Heming Zhao; +Cc: ocfs2-devel
On 4/12/22 4:22 PM, Heming Zhao wrote:
> My last reply (on 4.10) had messy format, because thunderbird automatically
> add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
> Please check my comments in below.
>
> On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
>>
>>
>> On 4/8/22 6:30 PM, Heming Zhao wrote:
>>> After this patch, when error, ocfs2_fill_super doesn't take care to
>>> release resources which are allocated in ocfs2_mount_volume.
>>>
>>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>>> ---
>>> fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
>>> 1 file changed, 20 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>>> index 8443ba031dec..d4b7a29cb364 100644
>>> --- a/fs/ocfs2/super.c
>>> +++ b/fs/ocfs2/super.c
>>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
>>> static int ocfs2_mount_volume(struct super_block *sb)
>>> {
>>> int status = 0;
>>> - int unlock_super = 0;
>>> struct ocfs2_super *osb = OCFS2_SB(sb);
>>>
>>> if (ocfs2_is_hard_readonly(osb))
>>> - goto leave;
>>> + goto out;
>>>
>>> mutex_init(&osb->obs_trim_fs_mutex);
>>>
>>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
>>> if (status == -EBADR && ocfs2_userspace_stack(osb))
>>> mlog(ML_ERROR, "couldn't mount because cluster name on"
>>> " disk does not match the running cluster name.\n");
>>> - goto leave;
>>> + goto out;
>>> }
>>>
>>> status = ocfs2_super_lock(osb, 1);
>>> if (status < 0) {
>>> mlog_errno(status);
>>> - goto leave;
>>> + goto out_dlm;
>>> }
>>> - unlock_super = 1;
>>>
>>> /* This will load up the node map and add ourselves to it. */
>>> status = ocfs2_find_slot(osb);
>>> if (status < 0) {
>>> mlog_errno(status);
>>> - goto leave;
>>> + goto out_super_lock;
>>> }
>>>
>>> /* load all node-local system inodes */
>>> status = ocfs2_init_local_system_inodes(osb);
>>> if (status < 0) {
>>> mlog_errno(status);
>>> - goto leave;
>>> + goto out_super_lock;
>>> }
>>>
>>> status = ocfs2_check_volume(osb);
>>> if (status < 0) {
>>> mlog_errno(status);
>>> - goto leave;
>>> + goto out_system_inodes;
>>> }
>>>
>>> status = ocfs2_truncate_log_init(osb);
>>> - if (status < 0)
>>> + if (status < 0) {
>>> mlog_errno(status);
>>> + goto out_journal_shutdown;
>>
>> ocfs2_check_volume() not only does journal initialization,
>> but also local alloc.
>>
>
> I agree and already handled this case. Please read following explanation.
>
> Call flow:
>
> ocfs2_fill_super
> ocfs2_mount_volume
> + ocfs2_init_local_system_inodes //[1]
> + ocfs2_check_volume
> ocfs2_load_local_alloc //may fail
>
> If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
> And other alloced inodes (from [1]) still exist.
>
> Also note, the sys inodes release function is ocfs2_release_system_inodes(),
> which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
> are NULL pointer case.
>
> The cleanup job will be done in both places:
> - ocfs2_mount_volume (patch 4/5)
> - ocfs2_fill_super (patch 5/5)
>
> Related code after patched 4/5:
> (please check my comment start with "hm:")
> ```
> static int ocfs2_mount_volume(struct super_block *sb)
> {
> ... ...
>
> status = ocfs2_check_volume(osb);
> if (status < 0) {
> mlog_errno(status);
>
> //hm:
> //need to release global/local sys inodes (by //ocfs2_release_system_inodes())
> //which are alloced by above [1].
> goto out_system_inodes;
> }
>
> status = ocfs2_truncate_log_init(osb);
> if (status < 0) {
> mlog_errno(status);
>
> //hm:
> //need to release journal: by ocfs2_journal_shutdown()
> //need to release global/local sys inodes: byocfs2_release_system_inodes()
> goto out_journal_shutdown;
> }
>
> ocfs2_super_unlock(osb, 1);
> return status; //<== hm: will change to 0 under next comment
>
> out_journal_shutdown:
> ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
> out_system_inodes:
> ocfs2_release_system_inodes(osb);
> out_super_lock:
> ocfs2_super_unlock(osb, 1);
> out_dlm:
> ocfs2_dlm_shutdown(osb, 0);
> out:
> return status;
> }
> ```
I mean we may need call ocfs2_shutdown_local_alloc() to do the cleanup.
In ocfs2_fill_super(), you don't do that.
Thanks,
Joseph
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume does cleanup job before return error
2022-04-12 11:54 ` Joseph Qi via Ocfs2-devel
@ 2022-04-12 16:46 ` Heming Zhao via Ocfs2-devel
0 siblings, 0 replies; 23+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-04-12 16:46 UTC (permalink / raw)
To: Joseph Qi, ocfs2-devel
On Tue, Apr 12, 2022 at 07:54:35PM +0800, Joseph Qi wrote:
>
> On 4/12/22 4:22 PM, Heming Zhao wrote:
> > My last reply (on 4.10) had messy format, because thunderbird automatically
> > add '\r' after '\n'. For easy review, I use neomutt to resend my replay.
> > Please check my comments in below.
> >
> > On Sat, Apr 09, 2022 at 09:46:19PM +0800, Joseph Qi wrote:
> >>
> >>
> >> On 4/8/22 6:30 PM, Heming Zhao wrote:
> >>> After this patch, when error, ocfs2_fill_super doesn't take care to
> >>> release resources which are allocated in ocfs2_mount_volume.
> >>>
> >>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> >>> ---
> >>> fs/ocfs2/super.c | 32 ++++++++++++++++++++------------
> >>> 1 file changed, 20 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> >>> index 8443ba031dec..d4b7a29cb364 100644
> >>> --- a/fs/ocfs2/super.c
> >>> +++ b/fs/ocfs2/super.c
> >>> @@ -1803,11 +1803,10 @@ static int ocfs2_get_sector(struct super_block *sb,
> >>> static int ocfs2_mount_volume(struct super_block *sb)
> >>> {
> >>> int status = 0;
> >>> - int unlock_super = 0;
> >>> struct ocfs2_super *osb = OCFS2_SB(sb);
> >>>
> >>> if (ocfs2_is_hard_readonly(osb))
> >>> - goto leave;
> >>> + goto out;
> >>>
> >>> mutex_init(&osb->obs_trim_fs_mutex);
> >>>
> >>> @@ -1817,44 +1816,53 @@ static int ocfs2_mount_volume(struct super_block *sb)
> >>> if (status == -EBADR && ocfs2_userspace_stack(osb))
> >>> mlog(ML_ERROR, "couldn't mount because cluster name on"
> >>> " disk does not match the running cluster name.\n");
> >>> - goto leave;
> >>> + goto out;
> >>> }
> >>>
> >>> status = ocfs2_super_lock(osb, 1);
> >>> if (status < 0) {
> >>> mlog_errno(status);
> >>> - goto leave;
> >>> + goto out_dlm;
> >>> }
> >>> - unlock_super = 1;
> >>>
> >>> /* This will load up the node map and add ourselves to it. */
> >>> status = ocfs2_find_slot(osb);
> >>> if (status < 0) {
> >>> mlog_errno(status);
> >>> - goto leave;
> >>> + goto out_super_lock;
> >>> }
> >>>
> >>> /* load all node-local system inodes */
> >>> status = ocfs2_init_local_system_inodes(osb);
> >>> if (status < 0) {
> >>> mlog_errno(status);
> >>> - goto leave;
> >>> + goto out_super_lock;
> >>> }
> >>>
> >>> status = ocfs2_check_volume(osb);
> >>> if (status < 0) {
> >>> mlog_errno(status);
> >>> - goto leave;
> >>> + goto out_system_inodes;
> >>> }
> >>>
> >>> status = ocfs2_truncate_log_init(osb);
> >>> - if (status < 0)
> >>> + if (status < 0) {
> >>> mlog_errno(status);
> >>> + goto out_journal_shutdown;
> >>
> >> ocfs2_check_volume() not only does journal initialization,
> >> but also local alloc.
> >>
> >
> > I agree and already handled this case. Please read following explanation.
> >
> > Call flow:
> >
> > ocfs2_fill_super
> > ocfs2_mount_volume
> > + ocfs2_init_local_system_inodes //[1]
> > + ocfs2_check_volume
> > ocfs2_load_local_alloc //may fail
> >
> > If ocfs2_load_local_alloc fails, only "//local_alloc:%04d" is bad.
> > And other alloced inodes (from [1]) still exist.
> >
> > Also note, the sys inodes release function is ocfs2_release_system_inodes(),
> > which could handle both osb->global_system_inodes[] & osb->local_system_inodes[]
> > are NULL pointer case.
> >
> > The cleanup job will be done in both places:
> > - ocfs2_mount_volume (patch 4/5)
> > - ocfs2_fill_super (patch 5/5)
> >
> > Related code after patched 4/5:
> > (please check my comment start with "hm:")
> > ```
> > static int ocfs2_mount_volume(struct super_block *sb)
> > {
> > ... ...
> >
> > status = ocfs2_check_volume(osb);
> > if (status < 0) {
> > mlog_errno(status);
> >
> > //hm:
> > //need to release global/local sys inodes (by //ocfs2_release_system_inodes())
> > //which are alloced by above [1].
> > goto out_system_inodes;
> > }
> >
> > status = ocfs2_truncate_log_init(osb);
> > if (status < 0) {
> > mlog_errno(status);
> >
> > //hm:
> > //need to release journal: by ocfs2_journal_shutdown()
> > //need to release global/local sys inodes: byocfs2_release_system_inodes()
> > goto out_journal_shutdown;
> > }
> >
> > ocfs2_super_unlock(osb, 1);
> > return status; //<== hm: will change to 0 under next comment
> >
> > out_journal_shutdown:
> > ocfs2_journal_shutdown(osb); //hm: through to label "out_system_inodes".
> > out_system_inodes:
> > ocfs2_release_system_inodes(osb);
> > out_super_lock:
> > ocfs2_super_unlock(osb, 1);
> > out_dlm:
> > ocfs2_dlm_shutdown(osb, 0);
> > out:
> > return status;
> > }
> > ```
> I mean we may need call ocfs2_shutdown_local_alloc() to do the cleanup.
> In ocfs2_fill_super(), you don't do that.
>
You are right, my mistake.
I also found another mistake in this patch: "goto out_journal_shutdown" is wrong.
After ocfs2_truncate_log_init() failure, code firstly calls ocfs2_journal_shutdown()
to clean journal then calls ocfs2_release_system_inodes(), it will trigger crash.
- Heming
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2022-04-13 13:50 UTC | newest]
Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08 10:30 [Ocfs2-devel] [PATCH v1 0/5] rewrite error handling during mounting stage Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 1/5] ocfs2: partly revert da5e7c87827e8 for mounting crash issue Heming Zhao via Ocfs2-devel
2022-04-09 13:11 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:28 ` heming.zhao--- via Ocfs2-devel
2022-04-10 12:00 ` Joseph Qi via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 2/5] ocfs2: change return type of ocfs2_resmap_init Heming Zhao via Ocfs2-devel
2022-04-09 13:14 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:29 ` heming.zhao--- via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 3/5] ocfs2: ocfs2_initialize_super does cleanup job before return error Heming Zhao via Ocfs2-devel
2022-04-09 13:30 ` Joseph Qi via Ocfs2-devel
2022-04-09 16:47 ` heming.zhao--- via Ocfs2-devel
2022-04-11 1:56 ` Joseph Qi via Ocfs2-devel
2022-04-11 2:09 ` heming.zhao--- via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 4/5] ocfs2: ocfs2_mount_volume " Heming Zhao via Ocfs2-devel
2022-04-09 13:46 ` Joseph Qi via Ocfs2-devel
2022-04-10 3:58 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:22 ` Heming Zhao via Ocfs2-devel
2022-04-12 11:54 ` Joseph Qi via Ocfs2-devel
2022-04-12 16:46 ` Heming Zhao via Ocfs2-devel
2022-04-08 10:30 ` [Ocfs2-devel] [PATCH v1 5/5] ocfs2: rewrite error handling of ocfs2_fill_super Heming Zhao via Ocfs2-devel
2022-04-09 14:02 ` Joseph Qi via Ocfs2-devel
2022-04-10 10:25 ` heming.zhao--- via Ocfs2-devel
2022-04-12 8:49 ` Heming Zhao via Ocfs2-devel
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.