From: Wengang Wang <wen.gang.wang@oracle.com>
To: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: ocfs2-devel ML <ocfs2-devel@oss.oracle.com>,
"GHe@suse.com" <GHe@suse.com>
Subject: Re: [Ocfs2-devel] [PATCH V2] ocfs2: fix deadlock between setattr and dio_end_io_write
Date: Mon, 5 Apr 2021 16:42:06 +0000 [thread overview]
Message-ID: <91ECDE2E-8A1B-4A91-B425-E8157081C083@oracle.com> (raw)
In-Reply-To: <994870d6-5b0e-3a22-6180-9beb501a59b0@linux.alibaba.com>
> On Apr 4, 2021, at 4:25 AM, Joseph Qi <joseph.qi@linux.alibaba.com> wrote:
>
>
>
> On 4/3/21 1:13 AM, Wengang Wang wrote:
>> The following deadlock is detected:
>>
>> truncate -> setattr path is waiting for pending direct IO to be done (
>> inode->i_dio_count become zero) with inode->i_rwsem held (down_write).
>>
>> PID: 14827 TASK: ffff881686a9af80 CPU: 20 COMMAND: "ora_p005_hrltd9"
>> #0 [ffffc9000bcf3c08] __schedule at ffffffff818667cc
>> #1 [ffffc9000bcf3ca0] schedule at ffffffff81866de6
>> #2 [ffffc9000bcf3cb8] inode_dio_wait at ffffffff812a2d04
>> #3 [ffffc9000bcf3d28] ocfs2_setattr at ffffffffc05f322e [ocfs2]
>> #4 [ffffc9000bcf3e18] notify_change at ffffffff812a5a09
>> #5 [ffffc9000bcf3e60] do_truncate at ffffffff812808f5
>> #6 [ffffc9000bcf3ed8] do_sys_ftruncate.constprop.18 at ffffffff81280cf2
>> #7 [ffffc9000bcf3f18] sys_ftruncate at ffffffff81280d8e
>> #8 [ffffc9000bcf3f28] do_syscall_64 at ffffffff81003949
>> #9 [ffffc9000bcf3f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad
>>
>> dio completion path is going to complete one direct IO (decrement
>> inode->i_dio_count), but before that it hang at locking inode->i_rwsem.
>>
>> #0 [ffffc90009b47b40] __schedule+700 at ffffffff818667cc
>> #1 [ffffc90009b47bd8] schedule+54 at ffffffff81866de6
>> #2 [ffffc90009b47bf0] rwsem_down_write_failed+536 at ffffffff8186aa28
>> #3 [ffffc90009b47c88] call_rwsem_down_write_failed+23 at ffffffff8185a1b7
>> #4 [ffffc90009b47cd0] down_write+45 at ffffffff81869c9d
>> #5 [ffffc90009b47ce8] ocfs2_dio_end_io_write+180 at ffffffffc05d5444 [ocfs2]
>> #6 [ffffc90009b47dd8] ocfs2_dio_end_io+85 at ffffffffc05d5a85 [ocfs2]
>> #7 [ffffc90009b47e18] dio_complete+140 at ffffffff812c873c
>> #8 [ffffc90009b47e50] dio_aio_complete_work+25 at ffffffff812c89f9
>> #9 [ffffc90009b47e60] process_one_work+361 at ffffffff810b1889
>> #10 [ffffc90009b47ea8] worker_thread+77 at ffffffff810b233d
>> #11 [ffffc90009b47f08] kthread+261 at ffffffff810b7fd5
>> #12 [ffffc90009b47f50] ret_from_fork+62 at ffffffff81a0035e
>>
>> Thus above forms ABBA deadlock. The same deadlock was mentioned in
>> upstream commit 28f5a8a7c033cbf3e32277f4cc9c6afd74f05300. well, it seems
>> that that commit just removed cluster lock (the victim of above dead
>> lock) from the ABBA deadlock party.
>>
>> End-user visible effects:
>> Process hang in truncate -> ocfs2_setattr path and other processes hang at
>> ocfs2_dio_end_io_write path.
>>
>> This is to fix the deadlock its self. It removes inode_lock() call from dio
>> completion path to remove the deadlock and add ip_alloc_sem lock in setattr
>> path to synchronize the inode modifications.
>>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
>> ---
>> v1 -> v2: remove the "had_alloc_lock" as suggested.
>> ---
>> fs/ocfs2/aops.c | 11 +----------
>> fs/ocfs2/file.c | 8 ++++++--
>> 2 files changed, 7 insertions(+), 12 deletions(-)
>>
>> diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
>> index 3bfb4147895a..ad20403b383f 100644
>> --- a/fs/ocfs2/aops.c
>> +++ b/fs/ocfs2/aops.c
>> @@ -2295,7 +2295,7 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>> struct ocfs2_alloc_context *meta_ac = NULL;
>> handle_t *handle = NULL;
>> loff_t end = offset + bytes;
>> - int ret = 0, credits = 0, locked = 0;
>> + int ret = 0, credits = 0;
>>
>> ocfs2_init_dealloc_ctxt(&dealloc);
>>
>> @@ -2306,13 +2306,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>> !dwc->dw_orphaned)
>> goto out;
>>
>> - /* ocfs2_file_write_iter will get i_mutex, so we need not lock if we
>> - * are in that context. */
>> - if (dwc->dw_writer_pid != task_pid_nr(current)) {
>> - inode_lock(inode);
>> - locked = 1;
>> - }
>> -
>> ret = ocfs2_inode_lock(inode, &di_bh, 1);
>> if (ret < 0) {
>> mlog_errno(ret);
>> @@ -2393,8 +2386,6 @@ static int ocfs2_dio_end_io_write(struct inode *inode,
>> if (meta_ac)
>> ocfs2_free_alloc_context(meta_ac);
>> ocfs2_run_deallocs(osb, &dealloc);
>> - if (locked)
>> - inode_unlock(inode);
>> ocfs2_dio_free_write_ctx(inode, dwc);
>>
>> return ret;
>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>> index 85979e2214b3..8880071ee4ee 100644
>> --- a/fs/ocfs2/file.c
>> +++ b/fs/ocfs2/file.c
>> @@ -1244,22 +1244,24 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
>> goto bail_unlock;
>> }
>> }
>> + down_write(&OCFS2_I(inode)->ip_alloc_sem);
>> handle = ocfs2_start_trans(osb, OCFS2_INODE_UPDATE_CREDITS +
>> 2 * ocfs2_quota_trans_credits(sb));
>> if (IS_ERR(handle)) {
>> status = PTR_ERR(handle);
>> mlog_errno(status);
>> - goto bail_unlock;
>> + goto bail_unlock_alloc;
>> }
>> status = __dquot_transfer(inode, transfer_to);
>> if (status < 0)
>> goto bail_commit;
>> } else {
>> + down_write(&OCFS2_I(inode)->ip_alloc_sem);
>
> Why not do down_write() just before if clause?
Yes, down_write() just before if clause works too. I just wanted lock as less/short as possible.
> Anyway, it also looks fine to me. So with or without that change,
>
> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Thanks Joseph for review!
Wengang
_______________________________________________
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-devel
prev parent reply other threads:[~2021-04-05 16:42 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-04-02 17:13 [Ocfs2-devel] [PATCH V2] ocfs2: fix deadlock between setattr and dio_end_io_write Wengang Wang
2021-04-04 11:25 ` Joseph Qi
2021-04-05 16:42 ` Wengang Wang [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=91ECDE2E-8A1B-4A91-B425-E8157081C083@oracle.com \
--to=wen.gang.wang@oracle.com \
--cc=GHe@suse.com \
--cc=joseph.qi@linux.alibaba.com \
--cc=ocfs2-devel@oss.oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).