All of lore.kernel.org
 help / color / mirror / Atom feed
* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
@ 2014-01-09  0:12 Goldwyn Rodrigues
  2014-01-09  1:29 ` Srinivas Eeda
  0 siblings, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09  0:12 UTC (permalink / raw)
  To: ocfs2-devel

Hi,

From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
legacy ocfs2 systems when a node received unlink votes. Since unlink
votes has been done away with and replaced with open locks, is this
flag still required? If yes, why?

From my ongoing investigation of unlink() times, it seems this flag is
causing the delay with releasing the open locks while downconverting
dentry locks. The flag is set  _everytime_ a dentry downconvert is
performed even if the file  is not scheduled to be deleted. If not, we
can be smartly evict the inodes which are *not* to be deleted
(i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
release faster speeding up unlink on the deleting node.

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09  0:12 [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED? Goldwyn Rodrigues
@ 2014-01-09  1:29 ` Srinivas Eeda
  2014-01-09  3:12   ` Goldwyn Rodrigues
  0 siblings, 1 reply; 14+ messages in thread
From: Srinivas Eeda @ 2014-01-09  1:29 UTC (permalink / raw)
  To: ocfs2-devel

Hi Goldwyn,

On 01/08/2014 04:12 PM, Goldwyn Rodrigues wrote:
> Hi,
>
> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
> legacy ocfs2 systems when a node received unlink votes. Since unlink
> votes has been done away with and replaced with open locks, is this
> flag still required? If yes, why?
My understanding is that unlink voting protocol was heavy. So the 
following was done to address it.

To do an unlink, dentry has to be removed. In order to do that the node 
has to get EX lock on the dentry which means all other nodes have to 
downconvert. In general EX lock on dentry is acquired only in unlink and 
I assume rename case. So all nodes which down convert the lock mark 
their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is 
that dentry on a node can get purged because of memory pressure which 
marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done 
on this inode.


> >From my ongoing investigation of unlink() times, it seems this flag is
> causing the delay with releasing the open locks while downconverting
> dentry locks. The flag is set  _everytime_ a dentry downconvert is
> performed even if the file  is not scheduled to be deleted. If not, we
> can be smartly evict the inodes which are *not* to be deleted
> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
> release faster speeding up unlink on the deleting node.
>
>
Are you referring to the delay caused by ocfs2_drop_dentry_lock queueing 
dentry locks to dentry_lock_list ?. If that's the case, have you tried 
removing following patches which introduced that behavior ? I think that 
quota's deadlock bug might have to be addressed differently ?

ea455f8ab68338ba69f5d3362b342c115bea8e13
eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
bb44bf820481e19381ec549118e4ee0b89d56191

The above patches were leaving orphan files around which was causing a 
big problem to some applications that removes lot of files which inturn 
caused intermittent hangs

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09  1:29 ` Srinivas Eeda
@ 2014-01-09  3:12   ` Goldwyn Rodrigues
  2014-01-09  5:30     ` Srinivas Eeda
  0 siblings, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09  3:12 UTC (permalink / raw)
  To: ocfs2-devel

Hi Srini,

On 01/08/2014 07:29 PM, Srinivas Eeda wrote:
> Hi Goldwyn,
>
> On 01/08/2014 04:12 PM, Goldwyn Rodrigues wrote:
>> Hi,
>>
>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
>> legacy ocfs2 systems when a node received unlink votes. Since unlink
>> votes has been done away with and replaced with open locks, is this
>> flag still required? If yes, why?
> My understanding is that unlink voting protocol was heavy. So the
> following was done to address it.
>
> To do an unlink, dentry has to be removed. In order to do that the node
> has to get EX lock on the dentry which means all other nodes have to
> downconvert. In general EX lock on dentry is acquired only in unlink and
> I assume rename case. So all nodes which down convert the lock mark
> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
> that dentry on a node can get purged because of memory pressure which
> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
> on this inode.
>

I think you are getting confused between dentry_lock (dentry_lockres) 
and open lock (ip_open_lockres). AFAICS, dentry locks are used to 
control the remote dentries.

>
>> >From my ongoing investigation of unlink() times, it seems this flag is
>> causing the delay with releasing the open locks while downconverting
>> dentry locks. The flag is set  _everytime_ a dentry downconvert is
>> performed even if the file  is not scheduled to be deleted. If not, we
>> can be smartly evict the inodes which are *not* to be deleted
>> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
>> release faster speeding up unlink on the deleting node.
>>
>>
> Are you referring to the delay caused by ocfs2_drop_dentry_lock queueing
> dentry locks to dentry_lock_list ?. If that's the case, have you tried
> removing following patches which introduced that behavior ? I think that
> quota's deadlock bug might have to be addressed differently ?
>
> ea455f8ab68338ba69f5d3362b342c115bea8e13

Yes, that should make some difference. Let me try that. However, I was 
suggesting we do not set the OCFS2_INODE_MAYBE_ORPHANED flag in 
ocfs2_dentry_convert_worker as well, but I am not sure of the 
consequences and that is the reason I asked why it is used.

> eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
> bb44bf820481e19381ec549118e4ee0b89d56191

I did not find these gits. Which tree are you referring to?

>
> The above patches were leaving orphan files around which was causing a
> big problem to some applications that removes lot of files which inturn
> caused intermittent hangs
>


-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09  3:12   ` Goldwyn Rodrigues
@ 2014-01-09  5:30     ` Srinivas Eeda
  2014-01-09 10:23       ` Joel Becker
  2014-01-09 15:44       ` Goldwyn Rodrigues
  0 siblings, 2 replies; 14+ messages in thread
From: Srinivas Eeda @ 2014-01-09  5:30 UTC (permalink / raw)
  To: ocfs2-devel

On 01/08/2014 07:12 PM, Goldwyn Rodrigues wrote:
> Hi Srini,
>
> On 01/08/2014 07:29 PM, Srinivas Eeda wrote:
>> Hi Goldwyn,
>>
>> On 01/08/2014 04:12 PM, Goldwyn Rodrigues wrote:
>>> Hi,
>>>
>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
>>> legacy ocfs2 systems when a node received unlink votes. Since unlink
>>> votes has been done away with and replaced with open locks, is this
>>> flag still required? If yes, why?
>> My understanding is that unlink voting protocol was heavy. So the
>> following was done to address it.
>>
>> To do an unlink, dentry has to be removed. In order to do that the node
>> has to get EX lock on the dentry which means all other nodes have to
>> downconvert. In general EX lock on dentry is acquired only in unlink and
>> I assume rename case. So all nodes which down convert the lock mark
>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
>> that dentry on a node can get purged because of memory pressure which
>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
>> on this inode.
>>
>
> I think you are getting confused between dentry_lock (dentry_lockres) 
> and open lock (ip_open_lockres). AFAICS, dentry locks are used to 
> control the remote dentries.
I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I 
guess I wasn't clear. I'll make an other attempt :).

One way for node A to tell node B that an unlink had happened on node A 
is by sending an explicit message(something similar to what we had in 
old release). When node B received such communication it marked inode 
with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.

The other way(current implementation) is to indirectly tell it by asking 
node B to purge dentry lockres. Once node B has been informed that 
dentry lock has to be released, it assumes inode might have been 
unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED flag.

So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it 
should finish the second phase of unlink(remove the inode from file 
system) when it closes the file.

>
>>
>>> >From my ongoing investigation of unlink() times, it seems this flag is
>>> causing the delay with releasing the open locks while downconverting
>>> dentry locks. The flag is set  _everytime_ a dentry downconvert is
>>> performed even if the file  is not scheduled to be deleted. If not, we
>>> can be smartly evict the inodes which are *not* to be deleted
>>> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
>>> release faster speeding up unlink on the deleting node.
>>>
>>>
>> Are you referring to the delay caused by ocfs2_drop_dentry_lock queueing
>> dentry locks to dentry_lock_list ?. If that's the case, have you tried
>> removing following patches which introduced that behavior ? I think that
>> quota's deadlock bug might have to be addressed differently ?
>>
>> ea455f8ab68338ba69f5d3362b342c115bea8e13
>
> Yes, that should make some difference. Let me try that. However, I was 
> suggesting we do not set the OCFS2_INODE_MAYBE_ORPHANED flag in 
> ocfs2_dentry_convert_worker as well, but I am not sure of the 
> consequences and that is the reason I asked why it is used.
>
>> eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
>> bb44bf820481e19381ec549118e4ee0b89d56191
>
> I did not find these gits. Which tree are you referring to?

Sorry, my bad. Those commit id's were from my local repo. I meant
f7b1aa69be138ad9d7d3f31fa56f4c9407f56b6a and
5fd131893793567c361ae64cbeb28a2a753bbe35
>
>>
>> The above patches were leaving orphan files around which was causing a
>> big problem to some applications that removes lot of files which inturn
>> caused intermittent hangs
>>
>
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09  5:30     ` Srinivas Eeda
@ 2014-01-09 10:23       ` Joel Becker
  2014-01-09 13:35         ` Goldwyn Rodrigues
  2014-01-09 15:44       ` Goldwyn Rodrigues
  1 sibling, 1 reply; 14+ messages in thread
From: Joel Becker @ 2014-01-09 10:23 UTC (permalink / raw)
  To: ocfs2-devel

Unlink can happen from anywhere, but only the last closing node can
actually remove the file.  MAYBE_ORPHANED tells the node to try for
removal at close time.  It is absolutely necessary.

Joel

On Wed, Jan 08, 2014 at 09:30:49PM -0800, Srinivas Eeda wrote:
> On 01/08/2014 07:12 PM, Goldwyn Rodrigues wrote:
> > Hi Srini,
> >
> > On 01/08/2014 07:29 PM, Srinivas Eeda wrote:
> >> Hi Goldwyn,
> >>
> >> On 01/08/2014 04:12 PM, Goldwyn Rodrigues wrote:
> >>> Hi,
> >>>
> >>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
> >>> legacy ocfs2 systems when a node received unlink votes. Since unlink
> >>> votes has been done away with and replaced with open locks, is this
> >>> flag still required? If yes, why?
> >> My understanding is that unlink voting protocol was heavy. So the
> >> following was done to address it.
> >>
> >> To do an unlink, dentry has to be removed. In order to do that the node
> >> has to get EX lock on the dentry which means all other nodes have to
> >> downconvert. In general EX lock on dentry is acquired only in unlink and
> >> I assume rename case. So all nodes which down convert the lock mark
> >> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
> >> that dentry on a node can get purged because of memory pressure which
> >> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
> >> on this inode.
> >>
> >
> > I think you are getting confused between dentry_lock (dentry_lockres) 
> > and open lock (ip_open_lockres). AFAICS, dentry locks are used to 
> > control the remote dentries.
> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I 
> guess I wasn't clear. I'll make an other attempt :).
> 
> One way for node A to tell node B that an unlink had happened on node A 
> is by sending an explicit message(something similar to what we had in 
> old release). When node B received such communication it marked inode 
> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
> 
> The other way(current implementation) is to indirectly tell it by asking 
> node B to purge dentry lockres. Once node B has been informed that 
> dentry lock has to be released, it assumes inode might have been 
> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED flag.
> 
> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it 
> should finish the second phase of unlink(remove the inode from file 
> system) when it closes the file.
> 
> >
> >>
> >>> >From my ongoing investigation of unlink() times, it seems this flag is
> >>> causing the delay with releasing the open locks while downconverting
> >>> dentry locks. The flag is set  _everytime_ a dentry downconvert is
> >>> performed even if the file  is not scheduled to be deleted. If not, we
> >>> can be smartly evict the inodes which are *not* to be deleted
> >>> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
> >>> release faster speeding up unlink on the deleting node.
> >>>
> >>>
> >> Are you referring to the delay caused by ocfs2_drop_dentry_lock queueing
> >> dentry locks to dentry_lock_list ?. If that's the case, have you tried
> >> removing following patches which introduced that behavior ? I think that
> >> quota's deadlock bug might have to be addressed differently ?
> >>
> >> ea455f8ab68338ba69f5d3362b342c115bea8e13
> >
> > Yes, that should make some difference. Let me try that. However, I was 
> > suggesting we do not set the OCFS2_INODE_MAYBE_ORPHANED flag in 
> > ocfs2_dentry_convert_worker as well, but I am not sure of the 
> > consequences and that is the reason I asked why it is used.
> >
> >> eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
> >> bb44bf820481e19381ec549118e4ee0b89d56191
> >
> > I did not find these gits. Which tree are you referring to?
> 
> Sorry, my bad. Those commit id's were from my local repo. I meant
> f7b1aa69be138ad9d7d3f31fa56f4c9407f56b6a and
> 5fd131893793567c361ae64cbeb28a2a753bbe35
> >
> >>
> >> The above patches were leaving orphan files around which was causing a
> >> big problem to some applications that removes lot of files which inturn
> >> caused intermittent hangs
> >>
> >
> >
> 
> 
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-devel

-- 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 10:23       ` Joel Becker
@ 2014-01-09 13:35         ` Goldwyn Rodrigues
  2014-01-13 15:39           ` Joel Becker
  0 siblings, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09 13:35 UTC (permalink / raw)
  To: ocfs2-devel

On 01/09/2014 04:23 AM, Joel Becker wrote:
> Unlink can happen from anywhere, but only the last closing node can
> actually remove the file.  MAYBE_ORPHANED tells the node to try for
> removal at close time.  It is absolutely necessary.
>

The reason I asked the query is that OCFS2_INODE_MAYBE_ORPHANED is being 
set at every dentry downconvert. Is this really necessary because every 
dentry downconvert does not turn into unlink? (I know it says maybe :/ )

Is it okay to set it when the open_lock fails or is it too late in the 
process? If another node has performed an unlink, it would need to get 
the open lock before it performs the inode wipe. So we should be safe 
that way? Is there anything incorrect in this design?


-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09  5:30     ` Srinivas Eeda
  2014-01-09 10:23       ` Joel Becker
@ 2014-01-09 15:44       ` Goldwyn Rodrigues
  2014-01-09 16:06         ` Srinivas Eeda
  1 sibling, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09 15:44 UTC (permalink / raw)
  To: ocfs2-devel

Hi Srini,

Thanks for the reply.

On 01/08/2014 11:30 PM, Srinivas Eeda wrote:
>>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
>>>> legacy ocfs2 systems when a node received unlink votes. Since unlink
>>>> votes has been done away with and replaced with open locks, is this
>>>> flag still required? If yes, why?
>>> My understanding is that unlink voting protocol was heavy. So the
>>> following was done to address it.
>>>
>>> To do an unlink, dentry has to be removed. In order to do that the node
>>> has to get EX lock on the dentry which means all other nodes have to
>>> downconvert. In general EX lock on dentry is acquired only in unlink and
>>> I assume rename case. So all nodes which down convert the lock mark
>>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
>>> that dentry on a node can get purged because of memory pressure which
>>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
>>> on this inode.
>>>
>>
>> I think you are getting confused between dentry_lock (dentry_lockres)
>> and open lock (ip_open_lockres). AFAICS, dentry locks are used to
>> control the remote dentries.
> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I
> guess I wasn't clear. I'll make an other attempt :).
>
> One way for node A to tell node B that an unlink had happened on node A
> is by sending an explicit message(something similar to what we had in
> old release). When node B received such communication it marked inode
> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
>
> The other way(current implementation) is to indirectly tell it by asking
> node B to purge dentry lockres. Once node B has been informed that
> dentry lock has to be released, it assumes inode might have been
> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED
> flag.
>
> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it
> should finish the second phase of unlink(remove the inode from file
> system) when it closes the file.

Okay, but  why should node B do the cleanup/wipe when node A initiated 
the unlink()? Shouldn't it be done by node A? All node B should do is to 
write the inode and clear it from the cache. The sequence is 
synchronized by dentry_lock. Right?

We are performing ocfs2_inode_lock() anyways which is re-reading the 
inode from disk (for node A)


>
>>
>>>
>>>> >From my ongoing investigation of unlink() times, it seems this flag is
>>>> causing the delay with releasing the open locks while downconverting
>>>> dentry locks. The flag is set  _everytime_ a dentry downconvert is
>>>> performed even if the file  is not scheduled to be deleted. If not, we
>>>> can be smartly evict the inodes which are *not* to be deleted
>>>> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
>>>> release faster speeding up unlink on the deleting node.
>>>>
>>>>
>>> Are you referring to the delay caused by ocfs2_drop_dentry_lock queueing
>>> dentry locks to dentry_lock_list ?. If that's the case, have you tried
>>> removing following patches which introduced that behavior ? I think that
>>> quota's deadlock bug might have to be addressed differently ?
>>>
>>> ea455f8ab68338ba69f5d3362b342c115bea8e13
>>
>> Yes, that should make some difference. Let me try that. However, I was
>> suggesting we do not set the OCFS2_INODE_MAYBE_ORPHANED flag in
>> ocfs2_dentry_convert_worker as well, but I am not sure of the
>> consequences and that is the reason I asked why it is used.
>>
>>> eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
>>> bb44bf820481e19381ec549118e4ee0b89d56191
>>
>> I did not find these gits. Which tree are you referring to?
>
> Sorry, my bad. Those commit id's were from my local repo. I meant
> f7b1aa69be138ad9d7d3f31fa56f4c9407f56b6a and
> 5fd131893793567c361ae64cbeb28a2a753bbe35
>>
>>>
>>> The above patches were leaving orphan files around which was causing a
>>> big problem to some applications that removes lot of files which inturn
>>> caused intermittent hangs

I think if we don't (ab)use OCFS2_INODE_MAYBE_ORPHANED, we should be 
better in this case as well, though I am not sure as of now.

Should I write a trial patch to explain better?

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 15:44       ` Goldwyn Rodrigues
@ 2014-01-09 16:06         ` Srinivas Eeda
  2014-01-09 16:34           ` Goldwyn Rodrigues
  0 siblings, 1 reply; 14+ messages in thread
From: Srinivas Eeda @ 2014-01-09 16:06 UTC (permalink / raw)
  To: ocfs2-devel

On 01/09/2014 07:44 AM, Goldwyn Rodrigues wrote:
> Hi Srini,
>
> Thanks for the reply.
>
> On 01/08/2014 11:30 PM, Srinivas Eeda wrote:
>>>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
>>>>> legacy ocfs2 systems when a node received unlink votes. Since unlink
>>>>> votes has been done away with and replaced with open locks, is this
>>>>> flag still required? If yes, why?
>>>> My understanding is that unlink voting protocol was heavy. So the
>>>> following was done to address it.
>>>>
>>>> To do an unlink, dentry has to be removed. In order to do that the 
>>>> node
>>>> has to get EX lock on the dentry which means all other nodes have to
>>>> downconvert. In general EX lock on dentry is acquired only in 
>>>> unlink and
>>>> I assume rename case. So all nodes which down convert the lock mark
>>>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
>>>> that dentry on a node can get purged because of memory pressure which
>>>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
>>>> on this inode.
>>>>
>>>
>>> I think you are getting confused between dentry_lock (dentry_lockres)
>>> and open lock (ip_open_lockres). AFAICS, dentry locks are used to
>>> control the remote dentries.
>> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I
>> guess I wasn't clear. I'll make an other attempt :).
>>
>> One way for node A to tell node B that an unlink had happened on node A
>> is by sending an explicit message(something similar to what we had in
>> old release). When node B received such communication it marked inode
>> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
>>
>> The other way(current implementation) is to indirectly tell it by asking
>> node B to purge dentry lockres. Once node B has been informed that
>> dentry lock has to be released, it assumes inode might have been
>> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED
>> flag.
>>
>> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it
>> should finish the second phase of unlink(remove the inode from file
>> system) when it closes the file.
>
> Okay, but  why should node B do the cleanup/wipe when node A initiated 
> the unlink()? Shouldn't it be done by node A? All node B should do is 
> to write the inode and clear it from the cache. The sequence is 
> synchronized by dentry_lock. Right?
removing dentry is only the first part. An inode can still be open after 
that.

>
> We are performing ocfs2_inode_lock() anyways which is re-reading the 
> inode from disk (for node A)
node B should do the cleanup in this case because it is the last node to 
close the file. Node A, will do the first phase of unlink(remove dentry) 
and node B will do the second phase of unlink(remove the inode).

>
>
>>
>>>
>>>>
>>>>> >From my ongoing investigation of unlink() times, it seems this 
>>>>> flag is
>>>>> causing the delay with releasing the open locks while downconverting
>>>>> dentry locks. The flag is set  _everytime_ a dentry downconvert is
>>>>> performed even if the file  is not scheduled to be deleted. If 
>>>>> not, we
>>>>> can be smartly evict the inodes which are *not* to be deleted
>>>>> (i_nlink>0) by not offloading to ocfs2_wq. This way open lock will
>>>>> release faster speeding up unlink on the deleting node.
>>>>>
>>>>>
>>>> Are you referring to the delay caused by ocfs2_drop_dentry_lock 
>>>> queueing
>>>> dentry locks to dentry_lock_list ?. If that's the case, have you tried
>>>> removing following patches which introduced that behavior ? I think 
>>>> that
>>>> quota's deadlock bug might have to be addressed differently ?
>>>>
>>>> ea455f8ab68338ba69f5d3362b342c115bea8e13
>>>
>>> Yes, that should make some difference. Let me try that. However, I was
>>> suggesting we do not set the OCFS2_INODE_MAYBE_ORPHANED flag in
>>> ocfs2_dentry_convert_worker as well, but I am not sure of the
>>> consequences and that is the reason I asked why it is used.
>>>
>>>> eb90e46458b08bc7c1c96420ca0eb4263dc1d6e5
>>>> bb44bf820481e19381ec549118e4ee0b89d56191
>>>
>>> I did not find these gits. Which tree are you referring to?
>>
>> Sorry, my bad. Those commit id's were from my local repo. I meant
>> f7b1aa69be138ad9d7d3f31fa56f4c9407f56b6a and
>> 5fd131893793567c361ae64cbeb28a2a753bbe35
>>>
>>>>
>>>> The above patches were leaving orphan files around which was causing a
>>>> big problem to some applications that removes lot of files which 
>>>> inturn
>>>> caused intermittent hangs
>
> I think if we don't (ab)use OCFS2_INODE_MAYBE_ORPHANED, we should be 
> better in this case as well, though I am not sure as of now.
>
> Should I write a trial patch to explain better?
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 16:06         ` Srinivas Eeda
@ 2014-01-09 16:34           ` Goldwyn Rodrigues
  2014-01-09 17:04             ` Srinivas Eeda
  0 siblings, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09 16:34 UTC (permalink / raw)
  To: ocfs2-devel

On 01/09/2014 10:06 AM, Srinivas Eeda wrote:
> On 01/09/2014 07:44 AM, Goldwyn Rodrigues wrote:
>> Hi Srini,
>>
>> Thanks for the reply.
>>
>> On 01/08/2014 11:30 PM, Srinivas Eeda wrote:
>>>>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was used in
>>>>>> legacy ocfs2 systems when a node received unlink votes. Since unlink
>>>>>> votes has been done away with and replaced with open locks, is this
>>>>>> flag still required? If yes, why?
>>>>> My understanding is that unlink voting protocol was heavy. So the
>>>>> following was done to address it.
>>>>>
>>>>> To do an unlink, dentry has to be removed. In order to do that the
>>>>> node
>>>>> has to get EX lock on the dentry which means all other nodes have to
>>>>> downconvert. In general EX lock on dentry is acquired only in
>>>>> unlink and
>>>>> I assume rename case. So all nodes which down convert the lock mark
>>>>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with this is
>>>>> that dentry on a node can get purged because of memory pressure which
>>>>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was done
>>>>> on this inode.
>>>>>
>>>>
>>>> I think you are getting confused between dentry_lock (dentry_lockres)
>>>> and open lock (ip_open_lockres). AFAICS, dentry locks are used to
>>>> control the remote dentries.
>>> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I
>>> guess I wasn't clear. I'll make an other attempt :).
>>>
>>> One way for node A to tell node B that an unlink had happened on node A
>>> is by sending an explicit message(something similar to what we had in
>>> old release). When node B received such communication it marked inode
>>> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
>>>
>>> The other way(current implementation) is to indirectly tell it by asking
>>> node B to purge dentry lockres. Once node B has been informed that
>>> dentry lock has to be released, it assumes inode might have been
>>> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED
>>> flag.
>>>
>>> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it
>>> should finish the second phase of unlink(remove the inode from file
>>> system) when it closes the file.
>>
>> Okay, but  why should node B do the cleanup/wipe when node A initiated
>> the unlink()? Shouldn't it be done by node A? All node B should do is
>> to write the inode and clear it from the cache. The sequence is
>> synchronized by dentry_lock. Right?
> removing dentry is only the first part. An inode can still be open after
> that.

Yes, I did not consider that.
How about using open locks ro_holders count to identify this? That may 
just work. Thanks!

>
>>
>> We are performing ocfs2_inode_lock() anyways which is re-reading the
>> inode from disk (for node A)
> node B should do the cleanup in this case because it is the last node to
> close the file. Node A, will do the first phase of unlink(remove dentry)
> and node B will do the second phase of unlink(remove the inode).
>

Agree. Lets see if we can use open locks for this.

Thanks!

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 16:34           ` Goldwyn Rodrigues
@ 2014-01-09 17:04             ` Srinivas Eeda
  2014-01-09 17:27               ` Goldwyn Rodrigues
  0 siblings, 1 reply; 14+ messages in thread
From: Srinivas Eeda @ 2014-01-09 17:04 UTC (permalink / raw)
  To: ocfs2-devel

On 01/09/2014 08:34 AM, Goldwyn Rodrigues wrote:
> On 01/09/2014 10:06 AM, Srinivas Eeda wrote:
>> On 01/09/2014 07:44 AM, Goldwyn Rodrigues wrote:
>>> Hi Srini,
>>>
>>> Thanks for the reply.
>>>
>>> On 01/08/2014 11:30 PM, Srinivas Eeda wrote:
>>>>>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was 
>>>>>>> used in
>>>>>>> legacy ocfs2 systems when a node received unlink votes. Since 
>>>>>>> unlink
>>>>>>> votes has been done away with and replaced with open locks, is this
>>>>>>> flag still required? If yes, why?
>>>>>> My understanding is that unlink voting protocol was heavy. So the
>>>>>> following was done to address it.
>>>>>>
>>>>>> To do an unlink, dentry has to be removed. In order to do that the
>>>>>> node
>>>>>> has to get EX lock on the dentry which means all other nodes have to
>>>>>> downconvert. In general EX lock on dentry is acquired only in
>>>>>> unlink and
>>>>>> I assume rename case. So all nodes which down convert the lock mark
>>>>>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with 
>>>>>> this is
>>>>>> that dentry on a node can get purged because of memory pressure 
>>>>>> which
>>>>>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was 
>>>>>> done
>>>>>> on this inode.
>>>>>>
>>>>>
>>>>> I think you are getting confused between dentry_lock (dentry_lockres)
>>>>> and open lock (ip_open_lockres). AFAICS, dentry locks are used to
>>>>> control the remote dentries.
>>>> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I
>>>> guess I wasn't clear. I'll make an other attempt :).
>>>>
>>>> One way for node A to tell node B that an unlink had happened on 
>>>> node A
>>>> is by sending an explicit message(something similar to what we had in
>>>> old release). When node B received such communication it marked inode
>>>> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
>>>>
>>>> The other way(current implementation) is to indirectly tell it by 
>>>> asking
>>>> node B to purge dentry lockres. Once node B has been informed that
>>>> dentry lock has to be released, it assumes inode might have been
>>>> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED
>>>> flag.
>>>>
>>>> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it
>>>> should finish the second phase of unlink(remove the inode from file
>>>> system) when it closes the file.
>>>
>>> Okay, but  why should node B do the cleanup/wipe when node A initiated
>>> the unlink()? Shouldn't it be done by node A? All node B should do is
>>> to write the inode and clear it from the cache. The sequence is
>>> synchronized by dentry_lock. Right?
>> removing dentry is only the first part. An inode can still be open after
>> that.
>
> Yes, I did not consider that.
> How about using open locks ro_holders count to identify this? That may 
> just work. Thanks!
One problem I see in using open lock for this is it could be late. 
Consider the scenario where node A removes the dentry and then the node 
crashes before trying the try_open_lock. Node B does the file close 
later but it doesn't know that the file was unlinked and doesn't do the 
clean up.

To me it appears OCFS2_INODE_MAYBE_ORPHANED is necessary. Any delay it 
is causing must be addressed differently.

Removing those 3 patches I pointed should help in synchronously deleting 
the file. But I am not sure if it would speed up the process.

>
>>
>>>
>>> We are performing ocfs2_inode_lock() anyways which is re-reading the
>>> inode from disk (for node A)
>> node B should do the cleanup in this case because it is the last node to
>> close the file. Node A, will do the first phase of unlink(remove dentry)
>> and node B will do the second phase of unlink(remove the inode).
>>
>
> Agree. Lets see if we can use open locks for this.
>
> Thanks!
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 17:04             ` Srinivas Eeda
@ 2014-01-09 17:27               ` Goldwyn Rodrigues
  2014-01-13 15:42                 ` Joel Becker
  0 siblings, 1 reply; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-09 17:27 UTC (permalink / raw)
  To: ocfs2-devel

On 01/09/2014 11:04 AM, Srinivas Eeda wrote:
> On 01/09/2014 08:34 AM, Goldwyn Rodrigues wrote:
>> On 01/09/2014 10:06 AM, Srinivas Eeda wrote:
>>> On 01/09/2014 07:44 AM, Goldwyn Rodrigues wrote:
>>>> Hi Srini,
>>>>
>>>> Thanks for the reply.
>>>>
>>>> On 01/08/2014 11:30 PM, Srinivas Eeda wrote:
>>>>>>>> >From the comments in fs/ocfs2/inode.h:90 it seems, this was
>>>>>>>> used in
>>>>>>>> legacy ocfs2 systems when a node received unlink votes. Since
>>>>>>>> unlink
>>>>>>>> votes has been done away with and replaced with open locks, is this
>>>>>>>> flag still required? If yes, why?
>>>>>>> My understanding is that unlink voting protocol was heavy. So the
>>>>>>> following was done to address it.
>>>>>>>
>>>>>>> To do an unlink, dentry has to be removed. In order to do that the
>>>>>>> node
>>>>>>> has to get EX lock on the dentry which means all other nodes have to
>>>>>>> downconvert. In general EX lock on dentry is acquired only in
>>>>>>> unlink and
>>>>>>> I assume rename case. So all nodes which down convert the lock mark
>>>>>>> their inode OCFS2_INODE_MAYBE_ORPHANED. The only problem with
>>>>>>> this is
>>>>>>> that dentry on a node can get purged because of memory pressure
>>>>>>> which
>>>>>>> marks inode as OCFS2_INODE_MAYBE_ORPHANED even when no unlink was
>>>>>>> done
>>>>>>> on this inode.
>>>>>>>
>>>>>>
>>>>>> I think you are getting confused between dentry_lock (dentry_lockres)
>>>>>> and open lock (ip_open_lockres). AFAICS, dentry locks are used to
>>>>>> control the remote dentries.
>>>>> I was trying to answer why we need OCFS2_INODE_MAYBE_ORPHANED flag, I
>>>>> guess I wasn't clear. I'll make an other attempt :).
>>>>>
>>>>> One way for node A to tell node B that an unlink had happened on
>>>>> node A
>>>>> is by sending an explicit message(something similar to what we had in
>>>>> old release). When node B received such communication it marked inode
>>>>> with OCFS2_INODE_MAYBE_ORPHANED flag if it still had the inode in use.
>>>>>
>>>>> The other way(current implementation) is to indirectly tell it by
>>>>> asking
>>>>> node B to purge dentry lockres. Once node B has been informed that
>>>>> dentry lock has to be released, it assumes inode might have been
>>>>> unlinked somewhere and marks the inode with OCFS2_INODE_MAYBE_ORPHANED
>>>>> flag.
>>>>>
>>>>> So, we need OCFS2_INODE_MAYBE_ORPHANED flag to tell node B that it
>>>>> should finish the second phase of unlink(remove the inode from file
>>>>> system) when it closes the file.
>>>>
>>>> Okay, but  why should node B do the cleanup/wipe when node A initiated
>>>> the unlink()? Shouldn't it be done by node A? All node B should do is
>>>> to write the inode and clear it from the cache. The sequence is
>>>> synchronized by dentry_lock. Right?
>>> removing dentry is only the first part. An inode can still be open after
>>> that.
>>
>> Yes, I did not consider that.
>> How about using open locks ro_holders count to identify this? That may
>> just work. Thanks!
> One problem I see in using open lock for this is it could be late.
> Consider the scenario where node A removes the dentry and then the node
> crashes before trying the try_open_lock. Node B does the file close
> later but it doesn't know that the file was unlinked and doesn't do the
> clean up.
>
> To me it appears OCFS2_INODE_MAYBE_ORPHANED is necessary. Any delay it
> is causing must be addressed differently.

No, I don't mean to remove the OCFS2_INODE_MAYBE_ORPHANED flag, but set 
it conditionally in ocfs2_dentry_convert_worker() based on the value of 
the open locks held.

I'll write a patch and test. Thanks!

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 13:35         ` Goldwyn Rodrigues
@ 2014-01-13 15:39           ` Joel Becker
  0 siblings, 0 replies; 14+ messages in thread
From: Joel Becker @ 2014-01-13 15:39 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Jan 09, 2014 at 07:35:05AM -0600, Goldwyn Rodrigues wrote:
> On 01/09/2014 04:23 AM, Joel Becker wrote:
> >Unlink can happen from anywhere, but only the last closing node can
> >actually remove the file.  MAYBE_ORPHANED tells the node to try for
> >removal at close time.  It is absolutely necessary.
> >
> 
> The reason I asked the query is that OCFS2_INODE_MAYBE_ORPHANED is
> being set at every dentry downconvert. Is this really necessary
> because every dentry downconvert does not turn into unlink? (I know
> it says maybe :/ )
> 
> Is it okay to set it when the open_lock fails or is it too late in
> the process? If another node has performed an unlink, it would need
> to get the open lock before it performs the inode wipe. So we should
> be safe that way? Is there anything incorrect in this design?

It's not safe.  Srini has already answered this on the other part of the
thread.  I'll address your other comments there.

Joel

> 
> 
> -- 
> Goldwyn

-- 

"You look in her eyes, the music begins to play.
 Hopeless romantics, here we go again."

			http://www.jlbec.org/
			jlbec at evilplan.org

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-09 17:27               ` Goldwyn Rodrigues
@ 2014-01-13 15:42                 ` Joel Becker
  2014-01-15 15:35                   ` Goldwyn Rodrigues
  0 siblings, 1 reply; 14+ messages in thread
From: Joel Becker @ 2014-01-13 15:42 UTC (permalink / raw)
  To: ocfs2-devel

On Thu, Jan 09, 2014 at 11:27:15AM -0600, Goldwyn Rodrigues wrote:
> >> Yes, I did not consider that.
> >> How about using open locks ro_holders count to identify this? That may
> >> just work. Thanks!
> > One problem I see in using open lock for this is it could be late.
> > Consider the scenario where node A removes the dentry and then the node
> > crashes before trying the try_open_lock. Node B does the file close
> > later but it doesn't know that the file was unlinked and doesn't do the
> > clean up.
> >
> > To me it appears OCFS2_INODE_MAYBE_ORPHANED is necessary. Any delay it
> > is causing must be addressed differently.
> 
> No, I don't mean to remove the OCFS2_INODE_MAYBE_ORPHANED flag, but set 
> it conditionally in ocfs2_dentry_convert_worker() based on the value of 
> the open locks held.

I'm confused by what you are attempting here.  We hold the dentry lock
until the final dpu() (see the comment in fs/ocfs2/dcache.c).  We should
never have ro_holders==0 unless we're flushing the entry from the
dcache.  Do you mean something else?

Joel

-- 

"Now Someone's on the telephone, desperate in his pain.
 Someone's on the bathroom floor doing her cocaine.
 Someone's got his finger on the button in some room.
 No one can convince me we aren't gluttons for our doom."

			http://www.jlbec.org/
			jlbec at evilplan.org

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED?
  2014-01-13 15:42                 ` Joel Becker
@ 2014-01-15 15:35                   ` Goldwyn Rodrigues
  0 siblings, 0 replies; 14+ messages in thread
From: Goldwyn Rodrigues @ 2014-01-15 15:35 UTC (permalink / raw)
  To: ocfs2-devel

On 01/13/2014 09:42 AM, Joel Becker wrote:
> On Thu, Jan 09, 2014 at 11:27:15AM -0600, Goldwyn Rodrigues wrote:
>>>> Yes, I did not consider that.
>>>> How about using open locks ro_holders count to identify this? That may
>>>> just work. Thanks!
>>> One problem I see in using open lock for this is it could be late.
>>> Consider the scenario where node A removes the dentry and then the node
>>> crashes before trying the try_open_lock. Node B does the file close
>>> later but it doesn't know that the file was unlinked and doesn't do the
>>> clean up.
>>>
>>> To me it appears OCFS2_INODE_MAYBE_ORPHANED is necessary. Any delay it
>>> is causing must be addressed differently.
>>
>> No, I don't mean to remove the OCFS2_INODE_MAYBE_ORPHANED flag, but set
>> it conditionally in ocfs2_dentry_convert_worker() based on the value of
>> the open locks held.
>
> I'm confused by what you are attempting here.  We hold the dentry lock
> until the final dpu() (see the comment in fs/ocfs2/dcache.c).  We should
> never have ro_holders==0 unless we're flushing the entry from the
> dcache.  Do you mean something else?
>

No, I meant exactly that and I thought wrong. However, the patch 
reversals pointed out by Srini have helped.

Thanks for the inputs,

-- 
Goldwyn

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2014-01-15 15:35 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-01-09  0:12 [Ocfs2-devel] What's the need of OCFS2_INODE_MAYBE_ORPHANED? Goldwyn Rodrigues
2014-01-09  1:29 ` Srinivas Eeda
2014-01-09  3:12   ` Goldwyn Rodrigues
2014-01-09  5:30     ` Srinivas Eeda
2014-01-09 10:23       ` Joel Becker
2014-01-09 13:35         ` Goldwyn Rodrigues
2014-01-13 15:39           ` Joel Becker
2014-01-09 15:44       ` Goldwyn Rodrigues
2014-01-09 16:06         ` Srinivas Eeda
2014-01-09 16:34           ` Goldwyn Rodrigues
2014-01-09 17:04             ` Srinivas Eeda
2014-01-09 17:27               ` Goldwyn Rodrigues
2014-01-13 15:42                 ` Joel Becker
2014-01-15 15:35                   ` Goldwyn Rodrigues

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.