ocfs2-devel.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
* [Ocfs2-devel] [PATCH v2] ocfs2: fix kernel crash after mounting when dlm doesn't ready
@ 2022-03-10 10:58 Heming Zhao via Ocfs2-devel
  2022-03-11  2:33 ` Joseph Qi via Ocfs2-devel
  0 siblings, 1 reply; 3+ messages in thread
From: Heming Zhao via Ocfs2-devel @ 2022-03-10 10:58 UTC (permalink / raw)
  To: ocfs2-devel, joseph.qi

Call trace:

  ocfs2: Registered cluster interface user
  dlm: no local IP address has been set
  dlm: cannot start dlm lowcomms -107
  (mount.ocfs2,2225,0):ocfs2_dlm_init:3355 ERROR: status = -107
  (mount.ocfs2,2225,0):ocfs2_mount_volume:1817 ERROR: status = -107
  (mount.ocfs2,2225,0):ocfs2_fill_super:1186 ERROR: status = -107
  BUG: kernel NULL pointer dereference, address: 0000000000000030
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 0 P4D 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 0 PID: 2225 Comm: mount.ocfs2 Not tainted 5.16.2-1-default #1 openSUSE Tumbleweed b40...e84
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ...
  RIP: 0010:ocfs2_clear_inode+0x2e9/0x720 [ocfs2]
  ...
  Call Trace:
   <TASK>
   ? ocfs2_evict_inode+0x1fe/0x630 [ocfs2 411bc..281]
   evict+0xc0/0x1c0
   ocfs2_release_system_inodes+0x21/0xc0 [ocfs2 411bc..281]
   ocfs2_dismount_volume+0x10b/0x2d0 [ocfs2 411bc..281]
   ocfs2_fill_super+0xaf/0x19e0 [ocfs2 411bc..281]
   mount_bdev+0x182/0x1b0
   ? ocfs2_initialize_super.isra.0+0xf50/0xf50 [ocfs2 411bc..281]
   legacy_get_tree+0x24/0x40
   vfs_get_tree+0x22/0xb0
   path_mount+0x465/0xac0
   __x64_sys_mount+0x103/0x140
   do_syscall_64+0x59/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x69/0x80
   ? syscall_exit_to_user_mode+0x18/0x40
   ? do_syscall_64+0x69/0x80
   ? exc_page_fault+0x68/0x150
   entry_SYSCALL_64_after_hwframe+0x44/0xae

How to trigger:

  tb-ocfs1 # dd if=/dev/zero of=/dev/vdb bs=1M count=20 oflag=direct
  tb-ocfs1 # mkfs.ocfs2 --cluster-stack=pcmk -N 4 /dev/vdb --cluster-name=ocfstst

             == there is no dlm running ==
  tb-ocfs1 # mount -t ocfs2 /dev/vdb /mnt
             == kernel crash ==

Analysis:

After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
Journal init later than before, it makes NULL pointer access in free
routine.

ocfs2_fill_super
 + ocfs2_mount_volume
 |  ocfs2_dlm_init //failed, osb->journal still NULL.
 + ...
 + ocfs2_dismount_volume
    ocfs2_release_system_inodes
      ...
       evict
        ...
         ocfs2_clear_inode
          ocfs2_checkpoint_inode
           ocfs2_ci_fully_checkpointed
            time_after(journal->j_trans_id, ci->ci_last_trans)
             + journal is empty, crash!

The fix should assess osb->journal before journal operation. And the fix
should be only related with abnormal case before ocfs2_journal_init().

Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
---
v2: revise commit log
    add checking in ocfs2_dismount_volume()
---
 fs/ocfs2/inode.c   | 3 ++-
 fs/ocfs2/journal.h | 2 +-
 fs/ocfs2/super.c   | 7 ++++---
 3 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 6c2411c2afcf..3826ab7eab3e 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -1205,7 +1205,8 @@ static void ocfs2_clear_inode(struct inode *inode)
 	 * the journal is flushed before journal shutdown. Thus it is safe to
 	 * have inodes get cleaned up after journal shutdown.
 	 */
-	jbd2_journal_release_jbd_inode(osb->journal->j_journal,
+	if (osb->journal)
+		jbd2_journal_release_jbd_inode(osb->journal->j_journal,
 				       &oi->ip_jinode);
 }
 
diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
index 8dcb2f2cadbc..1610d49b4112 100644
--- a/fs/ocfs2/journal.h
+++ b/fs/ocfs2/journal.h
@@ -189,7 +189,7 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode)
 {
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 
-	if (ocfs2_mount_local(osb))
+	if (!osb->journal || ocfs2_mount_local(osb))
 		return;
 
 	if (!ocfs2_ci_fully_checkpointed(INODE_CACHE(inode))) {
diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
index 2772dec9dcea..01f1ab7c8de8 100644
--- a/fs/ocfs2/super.c
+++ b/fs/ocfs2/super.c
@@ -1886,9 +1886,10 @@ static void ocfs2_dismount_volume(struct super_block *sb, int mnt_err)
 	/* Wait for worker to be done with the work structure in osb */
 	cancel_work_sync(&osb->dquot_drop_work);
 
-	ocfs2_shutdown_local_alloc(osb);
-
-	ocfs2_truncate_log_shutdown(osb);
+	if (osb->journal) {
+		ocfs2_shutdown_local_alloc(osb);
+		ocfs2_truncate_log_shutdown(osb);
+	}
 
 	/* This will disable recovery and flush any recovery work. */
 	ocfs2_recovery_exit(osb);
-- 
2.34.1


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix kernel crash after mounting when dlm doesn't ready
  2022-03-10 10:58 [Ocfs2-devel] [PATCH v2] ocfs2: fix kernel crash after mounting when dlm doesn't ready Heming Zhao via Ocfs2-devel
@ 2022-03-11  2:33 ` Joseph Qi via Ocfs2-devel
  2022-03-11  8:10   ` heming.zhao--- via Ocfs2-devel
  0 siblings, 1 reply; 3+ messages in thread
From: Joseph Qi via Ocfs2-devel @ 2022-03-11  2:33 UTC (permalink / raw)
  To: Heming Zhao, ocfs2-devel

Hi,
To prevent the crash, I think these changes are fine.
But it looks wried to spread the check of ocfs2 journal.
Can we intialize the ocfs2 journal much earlier?

Thanks,
Joseph

On 3/10/22 6:58 PM, Heming Zhao wrote:
> Call trace:
> 
>   ocfs2: Registered cluster interface user
>   dlm: no local IP address has been set
>   dlm: cannot start dlm lowcomms -107
>   (mount.ocfs2,2225,0):ocfs2_dlm_init:3355 ERROR: status = -107
>   (mount.ocfs2,2225,0):ocfs2_mount_volume:1817 ERROR: status = -107
>   (mount.ocfs2,2225,0):ocfs2_fill_super:1186 ERROR: status = -107
>   BUG: kernel NULL pointer dereference, address: 0000000000000030
>   #PF: supervisor read access in kernel mode
>   #PF: error_code(0x0000) - not-present page
>   PGD 0 P4D 0
>   Oops: 0000 [#1] PREEMPT SMP NOPTI
>   CPU: 0 PID: 2225 Comm: mount.ocfs2 Not tainted 5.16.2-1-default #1 openSUSE Tumbleweed b40...e84
>   Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ...
>   RIP: 0010:ocfs2_clear_inode+0x2e9/0x720 [ocfs2]
>   ...
>   Call Trace:
>    <TASK>
>    ? ocfs2_evict_inode+0x1fe/0x630 [ocfs2 411bc..281]
>    evict+0xc0/0x1c0
>    ocfs2_release_system_inodes+0x21/0xc0 [ocfs2 411bc..281]
>    ocfs2_dismount_volume+0x10b/0x2d0 [ocfs2 411bc..281]
>    ocfs2_fill_super+0xaf/0x19e0 [ocfs2 411bc..281]
>    mount_bdev+0x182/0x1b0
>    ? ocfs2_initialize_super.isra.0+0xf50/0xf50 [ocfs2 411bc..281]
>    legacy_get_tree+0x24/0x40
>    vfs_get_tree+0x22/0xb0
>    path_mount+0x465/0xac0
>    __x64_sys_mount+0x103/0x140
>    do_syscall_64+0x59/0x80
>    ? syscall_exit_to_user_mode+0x18/0x40
>    ? do_syscall_64+0x69/0x80
>    ? syscall_exit_to_user_mode+0x18/0x40
>    ? do_syscall_64+0x69/0x80
>    ? exc_page_fault+0x68/0x150
>    entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> How to trigger:
> 
>   tb-ocfs1 # dd if=/dev/zero of=/dev/vdb bs=1M count=20 oflag=direct
>   tb-ocfs1 # mkfs.ocfs2 --cluster-stack=pcmk -N 4 /dev/vdb --cluster-name=ocfstst
> 
>              == there is no dlm running ==
>   tb-ocfs1 # mount -t ocfs2 /dev/vdb /mnt
>              == kernel crash ==
> 
> Analysis:
> 
> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
> Journal init later than before, it makes NULL pointer access in free
> routine.
> 
> ocfs2_fill_super
>  + ocfs2_mount_volume
>  |  ocfs2_dlm_init //failed, osb->journal still NULL.
>  + ...
>  + ocfs2_dismount_volume
>     ocfs2_release_system_inodes
>       ...
>        evict
>         ...
>          ocfs2_clear_inode
>           ocfs2_checkpoint_inode
>            ocfs2_ci_fully_checkpointed
>             time_after(journal->j_trans_id, ci->ci_last_trans)
>              + journal is empty, crash!
> 
> The fix should assess osb->journal before journal operation. And the fix
> should be only related with abnormal case before ocfs2_journal_init().
> 
> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
> ---
> v2: revise commit log
>     add checking in ocfs2_dismount_volume()
> ---
>  fs/ocfs2/inode.c   | 3 ++-
>  fs/ocfs2/journal.h | 2 +-
>  fs/ocfs2/super.c   | 7 ++++---
>  3 files changed, 7 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 6c2411c2afcf..3826ab7eab3e 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -1205,7 +1205,8 @@ static void ocfs2_clear_inode(struct inode *inode)
>  	 * the journal is flushed before journal shutdown. Thus it is safe to
>  	 * have inodes get cleaned up after journal shutdown.
>  	 */
> -	jbd2_journal_release_jbd_inode(osb->journal->j_journal,
> +	if (osb->journal)
> +		jbd2_journal_release_jbd_inode(osb->journal->j_journal,
>  				       &oi->ip_jinode);
>  }
>  
> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
> index 8dcb2f2cadbc..1610d49b4112 100644
> --- a/fs/ocfs2/journal.h
> +++ b/fs/ocfs2/journal.h
> @@ -189,7 +189,7 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode)
>  {
>  	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>  
> -	if (ocfs2_mount_local(osb))
> +	if (!osb->journal || ocfs2_mount_local(osb))
>  		return;
>  
>  	if (!ocfs2_ci_fully_checkpointed(INODE_CACHE(inode))) {
> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
> index 2772dec9dcea..01f1ab7c8de8 100644
> --- a/fs/ocfs2/super.c
> +++ b/fs/ocfs2/super.c
> @@ -1886,9 +1886,10 @@ static void ocfs2_dismount_volume(struct super_block *sb, int mnt_err)
>  	/* Wait for worker to be done with the work structure in osb */
>  	cancel_work_sync(&osb->dquot_drop_work);
>  
> -	ocfs2_shutdown_local_alloc(osb);
> -
> -	ocfs2_truncate_log_shutdown(osb);
> +	if (osb->journal) {
> +		ocfs2_shutdown_local_alloc(osb);
> +		ocfs2_truncate_log_shutdown(osb);
> +	}
>  
>  	/* This will disable recovery and flush any recovery work. */
>  	ocfs2_recovery_exit(osb);

_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix kernel crash after mounting when dlm doesn't ready
  2022-03-11  2:33 ` Joseph Qi via Ocfs2-devel
@ 2022-03-11  8:10   ` heming.zhao--- via Ocfs2-devel
  0 siblings, 0 replies; 3+ messages in thread
From: heming.zhao--- via Ocfs2-devel @ 2022-03-11  8:10 UTC (permalink / raw)
  To: Joseph Qi, ocfs2-devel

On 3/11/22 10:33, Joseph Qi wrote:
> Hi,
> To prevent the crash, I think these changes are fine.
> But it looks wried to spread the check of ocfs2 journal.

Yes, I have same feeling when I'm making patch.

> Can we intialize the ocfs2 journal much earlier?
> 

calling it earlier only makes the gap more narrow not disappear.
(another way is to revert the commit da5e7c87827e8, it will make author unhappy)

I'm just learning ocfs2 code, don't have too much knowledge.
please forgive me for below foolish writing.

IMO all dirty data is useless on mounting phase, and can be safely dropped.

(or IOW, all the dirty data before journal init should be dropped directly)
I have tried to mark readonly when error happen, want to use readonly to skip
write operations but still crash.


I checked __ext4_fill_super, who calls ext4_load_journal lately too, but it uses
specific free steps before function return. Maybe we could do the same clean job
in ocfs2_fill_super or ocfs2_mount_volume(I prefer this func). This idea looks
need to write some new functions.

Thanks
> 
> On 3/10/22 6:58 PM, Heming Zhao wrote:
>> Call trace:
>>
>>    ocfs2: Registered cluster interface user
>>    dlm: no local IP address has been set
>>    dlm: cannot start dlm lowcomms -107
>>    (mount.ocfs2,2225,0):ocfs2_dlm_init:3355 ERROR: status = -107
>>    (mount.ocfs2,2225,0):ocfs2_mount_volume:1817 ERROR: status = -107
>>    (mount.ocfs2,2225,0):ocfs2_fill_super:1186 ERROR: status = -107
>>    BUG: kernel NULL pointer dereference, address: 0000000000000030
>>    #PF: supervisor read access in kernel mode
>>    #PF: error_code(0x0000) - not-present page
>>    PGD 0 P4D 0
>>    Oops: 0000 [#1] PREEMPT SMP NOPTI
>>    CPU: 0 PID: 2225 Comm: mount.ocfs2 Not tainted 5.16.2-1-default #1 openSUSE Tumbleweed b40...e84
>>    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS ...
>>    RIP: 0010:ocfs2_clear_inode+0x2e9/0x720 [ocfs2]
>>    ...
>>    Call Trace:
>>     <TASK>
>>     ? ocfs2_evict_inode+0x1fe/0x630 [ocfs2 411bc..281]
>>     evict+0xc0/0x1c0
>>     ocfs2_release_system_inodes+0x21/0xc0 [ocfs2 411bc..281]
>>     ocfs2_dismount_volume+0x10b/0x2d0 [ocfs2 411bc..281]
>>     ocfs2_fill_super+0xaf/0x19e0 [ocfs2 411bc..281]
>>     mount_bdev+0x182/0x1b0
>>     ? ocfs2_initialize_super.isra.0+0xf50/0xf50 [ocfs2 411bc..281]
>>     legacy_get_tree+0x24/0x40
>>     vfs_get_tree+0x22/0xb0
>>     path_mount+0x465/0xac0
>>     __x64_sys_mount+0x103/0x140
>>     do_syscall_64+0x59/0x80
>>     ? syscall_exit_to_user_mode+0x18/0x40
>>     ? do_syscall_64+0x69/0x80
>>     ? syscall_exit_to_user_mode+0x18/0x40
>>     ? do_syscall_64+0x69/0x80
>>     ? exc_page_fault+0x68/0x150
>>     entry_SYSCALL_64_after_hwframe+0x44/0xae
>>
>> How to trigger:
>>
>>    tb-ocfs1 # dd if=/dev/zero of=/dev/vdb bs=1M count=20 oflag=direct
>>    tb-ocfs1 # mkfs.ocfs2 --cluster-stack=pcmk -N 4 /dev/vdb --cluster-name=ocfstst
>>
>>               == there is no dlm running ==
>>    tb-ocfs1 # mount -t ocfs2 /dev/vdb /mnt
>>               == kernel crash ==
>>
>> Analysis:
>>
>> After commit da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown"),
>> Journal init later than before, it makes NULL pointer access in free
>> routine.
>>
>> ocfs2_fill_super
>>   + ocfs2_mount_volume
>>   |  ocfs2_dlm_init //failed, osb->journal still NULL.
>>   + ...
>>   + ocfs2_dismount_volume
>>      ocfs2_release_system_inodes
>>        ...
>>         evict
>>          ...
>>           ocfs2_clear_inode
>>            ocfs2_checkpoint_inode
>>             ocfs2_ci_fully_checkpointed
>>              time_after(journal->j_trans_id, ci->ci_last_trans)
>>               + journal is empty, crash!
>>
>> The fix should assess osb->journal before journal operation. And the fix
>> should be only related with abnormal case before ocfs2_journal_init().
>>
>> Fixes: da5e7c87827e8 ("ocfs2: cleanup journal init and shutdown")
>> Signed-off-by: Heming Zhao <heming.zhao@suse.com>
>> ---
>> v2: revise commit log
>>      add checking in ocfs2_dismount_volume()
>> ---
>>   fs/ocfs2/inode.c   | 3 ++-
>>   fs/ocfs2/journal.h | 2 +-
>>   fs/ocfs2/super.c   | 7 ++++---
>>   3 files changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
>> index 6c2411c2afcf..3826ab7eab3e 100644
>> --- a/fs/ocfs2/inode.c
>> +++ b/fs/ocfs2/inode.c
>> @@ -1205,7 +1205,8 @@ static void ocfs2_clear_inode(struct inode *inode)
>>   	 * the journal is flushed before journal shutdown. Thus it is safe to
>>   	 * have inodes get cleaned up after journal shutdown.
>>   	 */
>> -	jbd2_journal_release_jbd_inode(osb->journal->j_journal,
>> +	if (osb->journal)
>> +		jbd2_journal_release_jbd_inode(osb->journal->j_journal,
>>   				       &oi->ip_jinode);
>>   }
>>   
>> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
>> index 8dcb2f2cadbc..1610d49b4112 100644
>> --- a/fs/ocfs2/journal.h
>> +++ b/fs/ocfs2/journal.h
>> @@ -189,7 +189,7 @@ static inline void ocfs2_checkpoint_inode(struct inode *inode)
>>   {
>>   	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
>>   
>> -	if (ocfs2_mount_local(osb))
>> +	if (!osb->journal || ocfs2_mount_local(osb))
>>   		return;
>>   
>>   	if (!ocfs2_ci_fully_checkpointed(INODE_CACHE(inode))) {
>> diff --git a/fs/ocfs2/super.c b/fs/ocfs2/super.c
>> index 2772dec9dcea..01f1ab7c8de8 100644
>> --- a/fs/ocfs2/super.c
>> +++ b/fs/ocfs2/super.c
>> @@ -1886,9 +1886,10 @@ static void ocfs2_dismount_volume(struct super_block *sb, int mnt_err)
>>   	/* Wait for worker to be done with the work structure in osb */
>>   	cancel_work_sync(&osb->dquot_drop_work);
>>   
>> -	ocfs2_shutdown_local_alloc(osb);
>> -
>> -	ocfs2_truncate_log_shutdown(osb);
>> +	if (osb->journal) {
>> +		ocfs2_shutdown_local_alloc(osb);
>> +		ocfs2_truncate_log_shutdown(osb);
>> +	}
>>   
>>   	/* This will disable recovery and flush any recovery work. */
>>   	ocfs2_recovery_exit(osb);
> 


_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2022-03-11  8:10 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-10 10:58 [Ocfs2-devel] [PATCH v2] ocfs2: fix kernel crash after mounting when dlm doesn't ready Heming Zhao via Ocfs2-devel
2022-03-11  2:33 ` Joseph Qi via Ocfs2-devel
2022-03-11  8:10   ` heming.zhao--- via Ocfs2-devel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).