All of lore.kernel.org
 help / color / mirror / Atom feed
From: Changwei Ge <ge.changwei@h3c.com>
To: ocfs2-devel@oss.oracle.com
Subject: [Ocfs2-devel] [PATCH] ocfs2: should wait dio before inode lock in ocfs2_setattr()
Date: Sat, 28 Oct 2017 08:12:16 +0000	[thread overview]
Message-ID: <63ADC13FD55D6546B7DECE290D39E373CED72176@H3CMLB14-EX.srv.huawei-3com.com> (raw)
In-Reply-To: 59F3ECC8.1030509@huawei.com

Hi Alex,
On 2017/10/28 10:35, alex chen wrote:
> Hi Changwei,
> 
> Thanks for you reply.
> 
> On 2017/10/27 18:21, Changwei Ge wrote:
>> Hi Alex,
>>
>> Thanks for reporting.
>> I probably get your point. You mean that for a lock resource(say A), it
>> is used to protect metadata changing  among nodes in cluster.
>>
>> Unfortunately, it was marks as BLOCKED since it was granted with a EX
>> lock, and the lock can't be unblocked since it has more or equal to one
>> ::ex_holder(s), furthermore, since process 1 is waiting for all inflight
>> dio accomplishment, it won't give up its ownership of lock source A.
>>
>> Thus, hang, right?
> 
> Yes, I'm glad you can understand this.
> 
>>
>>   From code reviewing, I admit that the hang situation does exit.
>>
>> But as for your patch, how can you guarantee no more bio will be issued
>> from other nodes in cluster?
>>
> 
> First of all, we use the inode_lock() in do_truncate() to prevent another bio to
> be issued from this node.
> Furthermore, we use the ocfs2_rw_lock() and ocfs2_inode_lock() in ocfs2_setattr()
> to guarantee no more bio will be issued from the other nodes in this cluster.
Thanks for your particular elaboration.
I think this patch makes sense.
Acked.

Thanks,
Changwei

> 
> Thanks,
> Alex
> 
>> Also, I cc this patch to ocfs2 maintainers.
>>
>> Thanks,
>> Changwei
>>
>> On 2017/10/27 16:01, alex chen wrote:
>>> we should wait dio requests to finish before inode lock in
>>> ocfs2_setattr(), otherwise the following deadlock will be happened:
>>> process 1                  process 2                    process 3
>>> truncate file 'A'          end_io of writing file 'A'   receiving the bast messages
>>> ocfs2_setattr
>>>    ocfs2_inode_lock_tracker
>>>     ocfs2_inode_lock_full
>>>    inode_dio_wait
>>>     __inode_dio_wait
>>>     -->waiting for all dio
>>>     requests finish
>>>                                                           dlm_proxy_ast_handler
>>>                                                            dlm_do_local_bast
>>>                                                             ocfs2_blocking_ast
>>>                                                              ocfs2_generic_handle_bast
>>>                                                               set OCFS2_LOCK_BLOCKED flag
>>>                           dio_end_io
>>>                            dio_bio_end_aio
>>>                             dio_complete
>>>                              ocfs2_dio_end_io
>>>                               ocfs2_dio_end_io_write
>>>                                ocfs2_inode_lock
>>>                                 __ocfs2_cluster_lock
>>>                                  ocfs2_wait_for_mask
>>>                                  -->waiting for OCFS2_LOCK_BLOCKED
>>>                                  flag to be cleared, that is waiting
>>>                                  for 'process 1' unlocking the inode lock
>>>                              inode_dio_end
>>>                              -->here dec the i_dio_count, but will never
>>>                              be called, so a deadlock happened.
>>>
>>> Signed-off-by: Alex Chen <alex.chen@huawei.com>
>>> Reviewed-by: Jun Piao <piaojun@huawei.com>
Acked-by: Changwei Ge <ge.changwei@h3c.com>

>>>
>>> ---
>>>    fs/ocfs2/file.c | 9 +++++++--
>>>    1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
>>> index 6e41fc8..50e09a6 100644
>>> --- a/fs/ocfs2/file.c
>>> +++ b/fs/ocfs2/file.c
>>> @@ -1161,6 +1161,13 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
>>>    	}
>>>    	size_change = S_ISREG(inode->i_mode) && attr->ia_valid & ATTR_SIZE;
>>>    	if (size_change) {
>>> +
>>> +		/* here we should wait dio to finish before inode lock
>>> +		 * to avoid a deadlock between ocfs2_setattr() and
>>> +		 * ocfs2_dio_end_io_write()
>>> +		 */
>>> +		inode_dio_wait(inode);
>>> +
>>>    		status = ocfs2_rw_lock(inode, 1);
>>>    		if (status < 0) {
>>>    			mlog_errno(status);
>>> @@ -1200,8 +1207,6 @@ int ocfs2_setattr(struct dentry *dentry, struct iattr *attr)
>>>    		if (status)
>>>    			goto bail_unlock;
>>>
>>> -		inode_dio_wait(inode);
>>> -
>>>    		if (i_size_read(inode) >= attr->ia_size) {
>>>    			if (ocfs2_should_order_data(inode)) {
>>>    				status = ocfs2_begin_ordered_truncate(inode,
>>>
>>
>>
>> .
>>
> 
> 

  reply	other threads:[~2017-10-28  8:12 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-27  7:57 [Ocfs2-devel] [PATCH] ocfs2: should wait dio before inode lock in ocfs2_setattr() alex chen
2017-10-27 10:21 ` Changwei Ge
2017-10-28  2:34   ` alex chen
2017-10-28  8:12     ` Changwei Ge [this message]
2017-10-30 12:30 ` Joseph Qi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=63ADC13FD55D6546B7DECE290D39E373CED72176@H3CMLB14-EX.srv.huawei-3com.com \
    --to=ge.changwei@h3c.com \
    --cc=ocfs2-devel@oss.oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.