All of lore.kernel.org
 help / color / mirror / Atom feed
From: wangyan <wangyan122@huawei.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans()
Date: Thu, 9 Jan 2020 14:26:10 +0800	[thread overview]
Message-ID: <027dc960-8904-8bcb-b778-a496172110e1@huawei.com> (raw)
In-Reply-To: <HK0PR02MB37809C375533BE98F5FF2D6FD5390@HK0PR02MB3780.apcprd02.prod.outlook.com>



On 2020/1/9 9:57, Changwei Ge wrote:
>
> On 1/8/20 5:23 PM, wangyan wrote:
>> I found a NULL pointer dereference in ocfs2_update_inode_fsync_trans(),
>> handle->h_transaction may be NULL in this situation:
>> ocfs2_file_write_iter
>>     ->__generic_file_write_iter
>>         ->generic_perform_write
>>           ->ocfs2_write_begin
>>             ->ocfs2_write_begin_nolock
>>               ->ocfs2_write_cluster_by_desc
>>                 ->ocfs2_write_cluster
>>                   ->ocfs2_mark_extent_written
>>                     ->ocfs2_change_extent_flag
>>                       ->ocfs2_split_extent
>>                         ->ocfs2_try_to_merge_extent
>>                           ->ocfs2_extend_rotate_transaction
>>                             ->ocfs2_extend_trans
>>                               ->jbd2_journal_restart
>>                                 ->jbd2__journal_restart
>>                                   // handle->h_transaction is NULL here
>>                                   ->handle->h_transaction = NULL;
>>                                   ->start_this_handle
>>                                     /* journal aborted due to storage
>>                                        network disconnection, return error */
>>                                     ->return -EROFS;
>>                            /* line 3806 in ocfs2_try_to_merge_extent (),
>>                               it will ignore ret error. */
>>                           ->ret = 0;
>>           ->...
>>           ->ocfs2_write_end
>>             ->ocfs2_write_end_nolock
>>               ->ocfs2_update_inode_fsync_trans
>>                 // NULL pointer dereference
>>                 ->oi->i_sync_tid = handle->h_transaction->t_tid;
>>
>> The information of NULL pointer dereference as follows:
>>       JBD2: Detected IO errors while flushing file data on dm-11-45
>>       Aborting journal on device dm-11-45.
>>       JBD2: Error -5 detected when updating journal superblock for dm-11-45.
>>       (dd,22081,3):ocfs2_extend_trans:474 ERROR: status = -30
>>       (dd,22081,3):ocfs2_try_to_merge_extent:3877 ERROR: status = -30
>>       Unable to handle kernel NULL pointer dereference at
>>       virtual address 0000000000000008
>>       Mem abort info:
>>         ESR = 0x96000004
>>         Exception class = DABT (current EL), IL = 32 bits
>>         SET = 0, FnV = 0
>>         EA = 0, S1PTW = 0
>>       Data abort info:
>>         ISV = 0, ISS = 0x00000004
>>         CM = 0, WnR = 0
>>       user pgtable: 4k pages, 48-bit VAs, pgdp = 00000000e74e1338
>>       [0000000000000008] pgd=0000000000000000
>>       Internal error: Oops: 96000004 [#1] SMP
>>       Process dd (pid: 22081, stack limit = 0x00000000584f35a9)
>>       CPU: 3 PID: 22081 Comm: dd Kdump: loaded
>>       Hardware name: Huawei TaiShan 2280 V2/BC82AMDD, BIOS 0.98 08/25/2019
>>       pstate: 60400009 (nZCv daif +PAN -UAO)
>>       pc : ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
>>       lr : ocfs2_write_end_nolock+0x2a0/0x550 [ocfs2]
>>       sp : ffff0000459fba70
>>       x29: ffff0000459fba70 x28: 0000000000000000
>>       x27: ffff807ccf7f1000 x26: 0000000000000001
>>       x25: ffff807bdff57970 x24: ffff807caf1d4000
>>       x23: ffff807cc79e9000 x22: 0000000000001000
>>       x21: 000000006c6cd000 x20: ffff0000091d9000
>>       x19: ffff807ccb239db0 x18: ffffffffffffffff
>>       x17: 000000000000000e x16: 0000000000000007
>>       x15: ffff807c5e15bd78 x14: 0000000000000000
>>       x13: 0000000000000000 x12: 0000000000000000
>>       x11: 0000000000000000 x10: 0000000000000001
>>       x9 : 0000000000000228 x8 : 000000000000000c
>>       x7 : 0000000000000fff x6 : ffff807a308ed6b0
>>       x5 : ffff7e01f10967c0 x4 : 0000000000000018
>>       x3 : d0bc661572445600 x2 : 0000000000000000
>>       x1 : 000000001b2e0200 x0 : 0000000000000000
>>       Call trace:
>>        ocfs2_write_end_nolock+0x2b8/0x550 [ocfs2]
>>        ocfs2_write_end+0x4c/0x80 [ocfs2]
>>        generic_perform_write+0x108/0x1a8
>>        __generic_file_write_iter+0x158/0x1c8
>>        ocfs2_file_write_iter+0x668/0x950 [ocfs2]
>>        __vfs_write+0x11c/0x190
>>        vfs_write+0xac/0x1c0
>>        ksys_write+0x6c/0xd8
>>        __arm64_sys_write+0x24/0x30
>>        el0_svc_common+0x78/0x130
>>        el0_svc_handler+0x38/0x78
>>        el0_svc+0x8/0xc
>>
>> To prevent NULL pointer dereference in this situation, we use
>> is_handle_aborted() before using handle->h_transaction->t_tid.
>>
>> Signed-off-by: Yan Wang <wangyan122@huawei.com>
>> Reviewed-by: Jun Piao <piaojun@huawei.com>
>> ---
>>    fs/ocfs2/journal.h | 8 +++++---
>>    fs/ocfs2/namei.c   | 3 +--
>>    2 files changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/ocfs2/journal.h b/fs/ocfs2/journal.h
>> index 3103ba7f97a2..bfe611ed1b1d 100644
>> --- a/fs/ocfs2/journal.h
>> +++ b/fs/ocfs2/journal.h
>> @@ -597,9 +597,11 @@ static inline void
>> ocfs2_update_inode_fsync_trans(handle_t *handle,
>>    {
>>    	struct ocfs2_inode_info *oi = OCFS2_I(inode);
>>
>> -	oi->i_sync_tid = handle->h_transaction->t_tid;
>> -	if (datasync)
>> -		oi->i_datasync_tid = handle->h_transaction->t_tid;
>> +	if (!is_handle_aborted(handle)) {
>> +		oi->i_sync_tid = handle->h_transaction->t_tid;
>> +		if (datasync)
>> +			oi->i_datasync_tid = handle->h_transaction->t_tid;
>> +	}
>
>
> I don't think your way can fix the issue you reported completely.
>
> Even you check if the journal is ABORTED or not, you still face a race
> causing accessing NULL h_transaction.
>
> Otherwise, you need synchronization mechanism help.
Just one process, and handle is not shared by other processes, no race here.
>
> Besides, if journal is aborted, ocfs2 won't fence the machine by resetting?
Even journal is not aborted, it still has other wrong branch in 
start_this_handle(), which will return error before assign value to 
handle->h_transaction.
>
>
> Thanks,
>
> Changwei
>
>
>>    }
>>
>>    #endif /* OCFS2_JOURNAL_H */
>> diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
>> index 8ea51cf27b97..da65251ef815 100644
>> --- a/fs/ocfs2/namei.c
>> +++ b/fs/ocfs2/namei.c
>> @@ -586,8 +586,7 @@ static int __ocfs2_mknod_locked(struct inode *dir,
>>    			mlog_errno(status);
>>    	}
>>
>> -	oi->i_sync_tid = handle->h_transaction->t_tid;
>> -	oi->i_datasync_tid = handle->h_transaction->t_tid;
>> +	ocfs2_update_inode_fsync_trans(handle, inode, 1);
>>
>>    leave:
>>    	if (status < 0) {

      reply	other threads:[~2020-01-09  6:26 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-08  9:23 [Ocfs2-devel] [PATCH] ocfs2: fix a NULL pointer dereference when call ocfs2_update_inode_fsync_trans() wangyan
2020-01-08 11:31 ` Joseph Qi
2020-01-08 12:14   ` wangyan
2020-01-09  1:25     ` Joseph Qi
2020-01-09  1:37       ` wangyan
2020-01-09  1:57 ` Changwei Ge
2020-01-09  6:26   ` wangyan [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=027dc960-8904-8bcb-b778-a496172110e1@huawei.com \
    --to=wangyan122@huawei.com \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.