* RE: RE: Finding hardlinks
@ 2007-01-15 12:53 Noveck, Dave
  2007-01-16  6:06 ` [nfsv4] " Spencer Shepler
  0 siblings, 1 reply; 20+ messages in thread
From: Noveck, Dave @ 2007-01-15 12:53 UTC (permalink / raw)
  To: Benny Halevy, Trond Myklebust; +Cc: Spencer Shepler, nfs, nfsv4

I'm not going to be at Connectathon, but I could call in for a
discussion.

-----Original Message-----
From: Benny Halevy [mailto:bhalevy@panasas.com]
Sent: Monday, January 15, 2007 3:45 AM
To: Noveck, Dave; Trond Myklebust
Cc: Spencer Shepler; nfsv4@ietf.org; nfs@lists.sourceforge.net
Subject: Re: [nfsv4] RE: Finding hardlinks

How about discussing this topic in the upcoming Connectathon?

Benny

Noveck, Dave wrote:
> For now, I'm not going to address the controversial issues here,
> mainly because I haven't decided how I feel about them yet.
>
>      Whether allowing multiple filehandles per object is a good
>      or even reasonably acceptable idea.
>
>      What the fact that RFC3530 talks about it implies about what
>      clients should do about the issue.
>
> One thing that I hope is not controversial is that the v4.1 spec
> should either get rid of this or make it clear and implementable.
> I expect plenty of controversy about which of those to choose, but
> hope that there isn't any about the proposition that we have to choose
> one of those two.
>
>> SECINFO information is, for instance, given out on a per-filehandle
>> basis, does that mean that the server will have
>> different security policies?
>
> Well yes, RFC3530 does say "The new SECINFO operation will allow the
> client to determine, on a per filehandle basis", but I think that just
> has to be considered as an error rather than indicating that if you
> have two different filehandles for the same object, they can have
> different security policies.  SECINFO in RFC3530 takes a directory fh
> and a name, so if there are multiple filehandles for the object with
> that name, there is no way for SECINFO to associate different policies
> with different filehandles.  All it has is the name to go by.  I think
> this should be corrected to "on a per-object basis" in the new spec no
> matter what we do on other issues.
>
> I think the principle here has to be that if we do allow multiple fh's
> to map to the same object, we require that they designate the same
> object, and thus it is not allowed for the server to act as if you
> have multiple different objects with different characteristics.
>
> Similarly as to:
>
>> In some places, people haven't even started to think about the
>> consequences:
>>
>>     If GETATTR directed to the two filehandles does not return the
>>     fileid attribute for both of the handles, then it cannot be
>>     determined whether the two objects are the same.  Therefore,
>>     operations which depend on that knowledge (e.g., client side data
>>     caching) cannot be done reliably.
>
> I think they (and maybe "they" includes me, I haven't checked the
> history here) started to think about them, but went in a bad direction.
>
> The implication here that you can have a different set of attributes
> supported for the same object based on which filehandle is used to
> access the attributes is totally bogus.
>
> The definition of supp_attr says "The bit vector which would retrieve
> all mandatory and recommended attributes that are supported for this
> object.  The scope of this attribute applies to all objects with a
> matching fsid."  So having the same object have different attributes
> supported based on the filehandle used, or even two objects in the same
> fs having different attributes supported, in particular having fileid
> supported for one and not the other, just isn't valid.
>
>> The fact is that RFC3530 contains masses of rope with which to allow
>> server and client vendors to hang themselves.
>
> If that means simply making poor choices, then OK.  But if there are
> other cases where you feel that the specification of a feature is simply
> incoherent and the consequences not really thought out, then I think
> we need to discuss them and not propagate that state of affairs to v4.1.
>
> -----Original Message-----
> From: Trond Myklebust [mailto:trond.myklebust@fys.uio.no]
> Sent: Friday, January 05, 2007 5:29 AM
> To: Benny Halevy
> Cc: Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
> linux-kernel@vger.kernel.org; Mikulas Patocka;
> linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
> Subject: Re: [nfsv4] RE: Finding hardlinks
>
>
> On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
>> Trond Myklebust wrote:
>>> Exactly where do you see us violating the close-to-open cache
>>> consistency guarantees?
>>>
>> I haven't seen that. What I did see is cache inconsistency when
>> opening the same file with different file descriptors when the
>> filehandle changes.  My testing shows that at least fsync and close
>> fail with EIO when the filehandle changed while there was dirty data
>> in the cache, and that's good.  Still, not sharing the cache while
>> the file is opened (even on different file descriptors by the same
>> process) seems impractical.
>
> Tough. I'm not going to commit to adding support for multiple
> filehandles. The fact is that RFC3530 contains masses of rope with
> which to allow server and client vendors to hang themselves. The fact
> that the protocol claims support for servers that use multiple
> filehandles per inode does not mean it is necessarily a good idea. It
> adds unnecessary code complexity, it screws with server scalability
> (extra GETATTR calls just in order to probe existing filehandles), and
> it is insufficiently well documented in the RFC: SECINFO information
> is, for instance, given out on a per-filehandle basis, does that mean
> that the server will have different security policies? In some places,
> people haven't even started to think about the consequences:
>
>       If GETATTR directed to the two filehandles does not return the
>       fileid attribute for both of the handles, then it cannot be
>       determined whether the two objects are the same.  Therefore,
>       operations which depend on that knowledge (e.g., client side
>       data caching) cannot be done reliably.
>
> This implies the combination is legal, but offers no indication as to
> how you would match OPEN/CLOSE requests via different paths. AFAICS
> you would have to do non-cached I/O with no share modes (i.e.
> NFSv3-style "special" stateids). There is no way in hell we will ever
> support non-cached I/O in NFS other than the special case of O_DIRECT.
>
>
> ...and no, I'm certainly not interested in "fixing" the RFC on this
> point in any way other than getting this crap dropped from the spec.
> I see no use for it at all.
>
> Trond
>
>
> _______________________________________________
> nfsv4 mailing list
> nfsv4@ietf.org
> https://www1.ietf.org/mailman/listinfo/nfsv4



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-15 12:53 RE: Finding hardlinks Noveck, Dave
@ 2007-01-16  6:06 ` Spencer Shepler
  2007-01-16  6:16   ` [NFS] " Benny Halevy
  0 siblings, 1 reply; 20+ messages in thread
From: Spencer Shepler @ 2007-01-16  6:06 UTC (permalink / raw)
  To: nfsv4; +Cc: Benny Halevy, Spencer Shepler, nfs, Trond Myklebust


I won't be at connectathon either.  

btw: we do have 2.5 hours scheduled for Prague. :-)


On Mon, Noveck, Dave wrote:
> I'm not going to be at Connectathon, but I could call in for a
> discussion. 
> 
> [...]

_______________________________________________
NFS maillist  -  NFS@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nfs


* Re: [NFS] RE: Finding hardlinks
  2007-01-16  6:06 ` [nfsv4] " Spencer Shepler
@ 2007-01-16  6:16   ` Benny Halevy
  0 siblings, 0 replies; 20+ messages in thread
From: Benny Halevy @ 2007-01-16  6:16 UTC (permalink / raw)
  To: nfsv4, Benny Halevy, Trond Myklebust, Spencer Shepler, nfs

Good.  I plan to be in Prague.
Given that, we should continue the discussion over email
and present a summary and possibly a proposal in Prague.

Benny

Spencer Shepler wrote:
> I won't be at connectathon either.  
> 
> btw: we do have 2.5 hours scheduled for Prague. :-)
> 
> 
> [...]




* Re: [nfsv4] RE: Finding hardlinks
  2007-01-05 16:40                                 ` Nicolas Williams
  2007-01-05 16:56                                   ` Trond Myklebust
  2007-01-06  7:44                                   ` Halevy, Benny
@ 2007-01-10 13:04                                   ` Benny Halevy
  2 siblings, 0 replies; 20+ messages in thread
From: Benny Halevy @ 2007-01-10 13:04 UTC (permalink / raw)
  To: Benny Halevy, Trond Myklebust, Jan Harkes, Miklos Szeredi, nfsv4,
	linux-kernel, Mikulas Patocka, linux-fsdevel, Jeff Layton,
	Arjan van de Ven

Nicolas Williams wrote:
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
>> I agree that the way the client implements its cache is out of the protocol
>> scope. But how do you interpret "correct behavior" in section 4.2.1?
>>  "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
>> Don't you consider data corruption due to cache inconsistency an incorrect behavior?
> 
> If a file with multiple hardlinks appears to have multiple distinct
> filehandles then a client like Trond's will treat it as multiple
> distinct files (with the same hardlink count, and you won't be able to
> find the other links to them -- oh well).  Can this cause data
> corruption?  Yes, but only if there are applications that rely on the
> different file names referencing the same file, and backup apps on the
> client won't get the hardlinks right either.

The case I'm discussing is multiple filehandles for the same name,
not even for different hardlinks.  This causes spurious EIO errors
on the client when the filehandle changes and cache inconsistency
when opening the file multiple times in parallel.

> 
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.

It's not difficult at all, just that the client can't rely on the fileids to be
unique in both space and time because of server non-compliance (e.g. netapp's
snapshots) and fileid reuse after delete.
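
To put the reuse problem in pseudo-code (a hypothetical sketch, nothing
here is real client code; names and numbers are made up): a cache keyed
only on (fsid, fileid) hands back the dead file's data once a fileid is
recycled after a delete.

```python
# Hypothetical sketch: why (fsid, fileid) alone is not a safe cache key.
# After a delete, a server may reuse fileid 42 for a brand-new file, and
# a client keyed only on (fsid, fileid) then serves the old file's data.

cache = {}

def cache_put(fsid, fileid, data):
    cache[(fsid, fileid)] = data

def cache_lookup(fsid, fileid):
    return cache.get((fsid, fileid))

cache_put(7, 42, b"old file A")     # client caches file A
# ... on the server: A is removed, unrelated file B reuses fileid 42 ...
assert cache_lookup(7, 42) == b"old file A"   # stale hit, but it's B now
```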




* RE: [nfsv4] RE: Finding hardlinks
  2007-01-05 16:40                                 ` Nicolas Williams
  2007-01-05 16:56                                   ` Trond Myklebust
@ 2007-01-06  7:44                                   ` Halevy, Benny
  2007-01-10 13:04                                   ` Benny Halevy
  2 siblings, 0 replies; 20+ messages in thread
From: Halevy, Benny @ 2007-01-06  7:44 UTC (permalink / raw)
  To: Nicolas Williams
  Cc: Trond Myklebust, Jan Harkes, Miklos Szeredi, nfsv4, linux-kernel,
	Mikulas Patocka, linux-fsdevel, Jeff Layton, Arjan van de Ven

> From: linux-fsdevel-owner@vger.kernel.org on behalf of Nicolas Williams
> Sent: Fri 1/5/2007 18:40
> To: Halevy, Benny
> Cc: Trond Myklebust; Jan Harkes; Miklos Szeredi; nfsv4@ietf.org; linux-kernel@vger.kernel.org; Mikulas Patocka; linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
> Subject: Re: [nfsv4] RE: Finding hardlinks
> 
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
> > I agree that the way the client implements its cache is out of the protocol
> > scope. But how do you interpret "correct behavior" in section 4.2.1?
> >  "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients > need to be prepared for situations in which it cannot be determined whether two filehandles denote the same > object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
> > Don't you consider data corruption due to cache inconsistency an incorrect behavior?
> 
> If a file with multiple hardlinks appears to have multiple distinct
> filehandles then a client like Trond's will treat it as multiple
> distinct files (with the same hardlink count, and you won't be able to
> find the other links to them -- oh well).  Can this cause data
> corruption?  Yes, but only if there are applications that rely on the
> different file names referencing the same file, and backup apps on the
> client won't get the hardlinks right either.

Well, this is why the hard links were made, no?
FWIW, I believe that rename of an open file might also produce this problem.


> 
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.


The problem with NFS is that fileid isn't enough because the client doesn't
know about removes by other clients until it uses the stale filehandle.
Also, quite a few file systems are not keeping fileids unique (this triggered
this thread).
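
A sketch of the timing problem (hypothetical names, not real client
internals): the remove by the other client only becomes visible at the
next use of the cached filehandle, when the server finally answers with
a stale-filehandle error.

```python
# Hypothetical sketch: the client holds a filehandle from an earlier
# lookup; a remove by another client is invisible until the next RPC on
# that filehandle fails with a stale-filehandle error.

class Stale(Exception):
    pass

class Server:
    def __init__(self):
        self.live = {"fh1": 42}          # fh -> fileid
    def remove(self, fh):
        self.live.pop(fh, None)
    def getattr(self, fh):
        if fh not in self.live:
            raise Stale(fh)
        return {"fileid": self.live[fh]}

srv = Server()
cached_fh = "fh1"                        # cached by this client earlier
srv.remove(cached_fh)                    # another client removes the file

try:
    srv.getattr(cached_fh)               # only now does the client find out
    hit_stale = False
except Stale:
    hit_stale = True
assert hit_stale
```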
 
> 
> Nico
> --



* RE: [nfsv4] RE: Finding hardlinks
@ 2007-01-05 17:24 Noveck, Dave
  0 siblings, 0 replies; 20+ messages in thread
From: Noveck, Dave @ 2007-01-05 17:24 UTC (permalink / raw)
  To: Trond Myklebust, Benny Halevy
  Cc: Jan Harkes, Miklos Szeredi, nfsv4, linux-kernel, Mikulas Patocka,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

For now, I'm not going to address the controversial issues here,
mainly because I haven't decided how I feel about them yet.

     Whether allowing multiple filehandles per object is a good
     or even reasonably acceptable idea.

     What the fact that RFC3530 talks about it implies about what
     clients should do about the issue.

One thing that I hope is not controversial is that the v4.1 spec
should either get rid of this or make it clear and implementable.
I expect plenty of controversy about which of those to choose, but
hope that there isn't any about the proposition that we have to 
choose one of those two.

> SECINFO information is, for instance, given
> out on a per-filehandle basis, does that mean that the server will
> have different security policies?

Well yes, RFC3530 does say "The new SECINFO operation will allow the 
client to determine, on a per filehandle basis", but I think that
just has to be considered as an error rather than indicating that if
you have two different filehandles for the same object, they can have 
different security policies.  SECINFO in RFC3530 takes a directory fh
and a name, so if there are multiple filehandles for the object with
that name, there is no way for SECINFO to associate different policies
with different filehandles.  All it has is the name to go by.  I think
this should be corrected to "on a per-object basis" in the new spec no 
matter what we do on other issues.
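
To put the same point in pseudo-code (a hypothetical sketch, not
anything from the spec or an actual server; the policy names are made
up): because SECINFO's arguments are a directory fh and a name, a server
has at most one policy slot per (directory, name) pair, no matter how
many filehandles the object has.

```python
# Hypothetical sketch: a SECINFO-style policy table can only be keyed by
# (directory fh, name).  Multiple filehandles for the object named "f"
# are indistinguishable in this lookup.

policies = {("dir_fh", "f"): ["krb5i", "sys"]}   # illustrative entries

def secinfo(dir_fh, name):
    # The object's own filehandle(s) never enter into the lookup.
    return policies[(dir_fh, name)]

# Whichever filehandle a client holds for "f", it gets the same answer:
assert secinfo("dir_fh", "f") == ["krb5i", "sys"]
```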

I think the principle here has to be that if we do allow multiple 
fh's to map to the same object, we require that they designate the 
same object, and thus it is not allowed for the server to act as if
you have multiple different objects with different characteristics.
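
A rough sketch of what that principle means for a client (hypothetical
helper, illustrative attribute values): filehandle equality may serve as
a fast path, but unequal filehandles must still be checked against
(fsid, fileid) before being treated as distinct objects.

```python
# Hypothetical sketch: equal filehandles imply the same object; unequal
# filehandles prove nothing by themselves and must fall back to
# comparing the (fsid, fileid) pair returned by GETATTR.

def same_object(fh_a, attrs_a, fh_b, attrs_b):
    if fh_a == fh_b:
        return True  # fast path: same fh, same object
    return (attrs_a["fsid"], attrs_a["fileid"]) == (
        attrs_b["fsid"], attrs_b["fileid"])

attrs = {"fsid": 7, "fileid": 100}
assert same_object("fh_X", attrs, "fh_Y", attrs)   # two fh's, one object
assert not same_object("fh_X", attrs, "fh_Z", {"fsid": 7, "fileid": 101})
```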

Similarly as to:

> In some places, people haven't even started
> to think about the consequences: 
>
>     If GETATTR directed to the two filehandles does not return the
>     fileid attribute for both of the handles, then it cannot be
>     determined whether the two objects are the same.  Therefore,
>     operations which depend on that knowledge (e.g., client side data
>     caching) cannot be done reliably.

I think they (and maybe "they" includes me, I haven't checked the
history
here) started to think about them, but went in a bad direction.

The implication here that you can have a different set of attributes
supported for the same object based on which filehandle is used to 
access the attributes is totally bogus.

The definition of supp_attr says "The bit vector which would retrieve
all mandatory and recommended attributes that are supported for this 
object.  The scope of this attribute applies to all objects with a
matching fsid."  So having the same object have different attributes
supported based on the filehandle used or even two objects in the same
fs having different attributes supported, in particular having fileid
supported for one and not the other just isn't valid.
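
A toy consistency check (illustrative Python, not drawn from any real
implementation; the bit position shown is just RFC 3530's attribute
number used as an example) makes the quoted rule concrete: all objects
sharing an fsid must advertise one and the same supported-attribute
mask.

```python
# supp_attr scope check: every object with a matching fsid must report
# the same supported-attribute bit vector. Hypothetical sketch.

FATTR4_FILEID = 1 << 20   # fileid is attribute number 20 in RFC 3530

def supp_attr_consistent(objects):
    """objects: iterable of (fsid, supp_attr_bitmask) pairs."""
    masks = {}
    for fsid, mask in objects:
        if masks.setdefault(fsid, mask) != mask:
            return False  # two objects in one fs disagree: invalid server
    return True

# Supporting fileid for one object but not another in the same
# filesystem violates the rule:
bad = [("fsid-1", FATTR4_FILEID | 0xFF), ("fsid-1", 0xFF)]
good = [("fsid-1", FATTR4_FILEID | 0xFF), ("fsid-1", FATTR4_FILEID | 0xFF)]
```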

> The fact is that RFC3530 contains masses of rope with which
> to allow server and client vendors to hang themselves. 

If that means simply making poor choices, then OK.  But if there are 
other cases where you feel that the specification of a feature is 
simply incoherent and the consequences not really thought out, then I 
think we need to discuss them and not propagate that state of affairs 
to v4.1.

-----Original Message-----
From: Trond Myklebust [mailto:trond.myklebust@fys.uio.no] 
Sent: Friday, January 05, 2007 5:29 AM
To: Benny Halevy
Cc: Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
linux-kernel@vger.kernel.org; Mikulas Patocka;
linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
Subject: Re: [nfsv4] RE: Finding hardlinks


On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
> Trond Myklebust wrote:
> > Exactly where do you see us violating the close-to-open cache
> > consistency guarantees?
> > 
> 
> I haven't seen that. What I did see is cache inconsistency when
> opening the same file with different file descriptors when the
> filehandle changes. My testing shows that at least fsync and close
> fail with EIO when the filehandle changed while there was dirty data
> in the cache and that's good. Still, not sharing the cache while the
> file is open (even on different file descriptors by the same process)
> seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis; does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences:

      If GETATTR directed to the two filehandles does not return the
      fileid attribute for both of the handles, then it cannot be
      determined whether the two objects are the same.  Therefore,
      operations which depend on that knowledge (e.g., client side data
      caching) cannot be done reliably.

This implies the combination is legal, but offers no indication as to
how you would match OPEN/CLOSE requests via different paths. AFAICS you
would have to do non-cached I/O with no share modes (i.e. NFSv3-style
"special" stateids). There is no way in hell we will ever support
non-cached I/O in NFS other than the special case of O_DIRECT.


...and no, I'm certainly not interested in "fixing" the RFC on this
point in any way other than getting this crap dropped from the spec. I
see no use for it at all.

Trond


_______________________________________________
nfsv4 mailing list
nfsv4@ietf.org
https://www1.ietf.org/mailman/listinfo/nfsv4

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [nfsv4] RE: Finding hardlinks
  2007-01-05 16:40                                 ` Nicolas Williams
@ 2007-01-05 16:56                                   ` Trond Myklebust
  2007-01-06  7:44                                   ` Halevy, Benny
  2007-01-10 13:04                                   ` Benny Halevy
  2 siblings, 0 replies; 20+ messages in thread
From: Trond Myklebust @ 2007-01-05 16:56 UTC (permalink / raw)
  To: Nicolas Williams
  Cc: Benny Halevy, Jan Harkes, Miklos Szeredi, nfsv4, linux-kernel,
	Mikulas Patocka, linux-fsdevel, Jeff Layton, Arjan van de Ven

On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote:
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.

You've been sleeping in class. We always try to get the fileid together
with the GETFH. The irritating bit is having to redo a GETATTR using the
old filehandle in order to figure out if the 2 filehandles refer to the
same file. Unlike filehandles, fileids can be reused.
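
A toy sketch of that round trip (illustrative Python with invented
names, not the Linux client code): a fileid match alone proves
nothing, because fileids can be reused after a delete, so the client
must GETATTR the old filehandle to see whether it still designates a
live object with that fileid.

```python
# Hypothetical model of revalidating an old filehandle with GETATTR
# before concluding that two filehandles name the same file.

class ToyAttrServer:
    def __init__(self, table):
        self.table = table             # filehandle -> attrs; absent == stale

    def getattr(self, fh):
        return self.table.get(fh)      # None stands in for ESTALE

def same_object(server, cached_fh, cached_fileid, new_fh, new_fileid):
    if cached_fh == new_fh:
        return True                    # identical handles: trivially the same
    if cached_fileid != new_fileid:
        return False                   # different fileids: different objects
    attrs = server.getattr(cached_fh)  # the extra GETATTR on the old handle
    if attrs is None:
        return False                   # old object deleted; fileid was reused
    return attrs["fileid"] == new_fileid
```

If the old handle has gone stale, a new handle carrying the same
fileid has to be treated as a different object.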

Then there is the matter of dealing with the fact that servers can
(and do!) actually lie to you.

Trond



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04 10:04                               ` Benny Halevy
  2007-01-04 10:47                                 ` Trond Myklebust
@ 2007-01-05 16:40                                 ` Nicolas Williams
  2007-01-05 16:56                                   ` Trond Myklebust
                                                     ` (2 more replies)
  1 sibling, 3 replies; 20+ messages in thread
From: Nicolas Williams @ 2007-01-05 16:40 UTC (permalink / raw)
  To: Benny Halevy
  Cc: Trond Myklebust, Jan Harkes, Miklos Szeredi, nfsv4, linux-kernel,
	Mikulas Patocka, linux-fsdevel, Jeff Layton, Arjan van de Ven

On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
> I agree that the way the client implements its cache is out of the protocol
> scope. But how do you interpret "correct behavior" in section 4.2.1?
>  "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
> Don't you consider data corruption due to cache inconsistency an incorrect behavior?

If a file with multiple hardlinks appears to have multiple distinct
filehandles then a client like Trond's will treat it as multiple
distinct files (with the same hardlink count, and you won't be able to
find the other links to them -- oh well).  Can this cause data
corruption?  Yes, but only if there are applications that rely on the
different file names referencing the same file, and backup apps on the
client won't get the hardlinks right either.

What I don't understand is why getting the fileid is so hard -- always
GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
difficult as it is to maintain a hash table of fileids.

Nico
-- 


* Re: [nfsv4] RE: Finding hardlinks
  2007-01-05  8:28                                   ` Benny Halevy
@ 2007-01-05 10:29                                     ` Trond Myklebust
  0 siblings, 0 replies; 20+ messages in thread
From: Trond Myklebust @ 2007-01-05 10:29 UTC (permalink / raw)
  To: Benny Halevy
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
> Trond Myklebust wrote:
> > Exactly where do you see us violating the close-to-open cache
> > consistency guarantees?
> > 
> 
> I haven't seen that. What I did see is cache inconsistency when opening
> the same file with different file descriptors when the filehandle changes.
> My testing shows that at least fsync and close fail with EIO when the filehandle
> changed while there was dirty data in the cache and that's good. Still,
not sharing the cache while the file is open (even on different file
descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis; does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences:

      If GETATTR directed to the two filehandles does not return the
      fileid attribute for both of the handles, then it cannot be
      determined whether the two objects are the same.  Therefore,
      operations which depend on that knowledge (e.g., client side data
      caching) cannot be done reliably.

This implies the combination is legal, but offers no indication as to
how you would match OPEN/CLOSE requests via different paths. AFAICS you
would have to do non-cached I/O with no share modes (i.e. NFSv3-style
"special" stateids). There is no way in hell we will ever support
non-cached I/O in NFS other than the special case of O_DIRECT.


...and no, I'm certainly not interested in "fixing" the RFC on this
point in any way other than getting this crap dropped from the spec. I
see no use for it at all.

Trond



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04 10:47                                 ` Trond Myklebust
  2007-01-04 18:12                                   ` Bryan Henderson
@ 2007-01-05  8:28                                   ` Benny Halevy
  2007-01-05 10:29                                     ` Trond Myklebust
  1 sibling, 1 reply; 20+ messages in thread
From: Benny Halevy @ 2007-01-05  8:28 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

Trond Myklebust wrote:
> On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote:
>> I agree that the way the client implements its cache is out of the protocol
>> scope. But how do you interpret "correct behavior" in section 4.2.1?
>>  "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
>> Don't you consider data corruption due to cache inconsistency an incorrect behavior?
> 
> Exactly where do you see us violating the close-to-open cache
> consistency guarantees?
> 

I haven't seen that. What I did see is cache inconsistency when opening
the same file with different file descriptors when the filehandle changes.
My testing shows that at least fsync and close fail with EIO when the filehandle
changed while there was dirty data in the cache and that's good. Still,
not sharing the cache while the file is open (even on different file
descriptors by the same process) seems impractical.

Benny


* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04 18:12                                   ` Bryan Henderson
@ 2007-01-04 18:26                                     ` Peter Staubach
  0 siblings, 0 replies; 20+ messages in thread
From: Peter Staubach @ 2007-01-04 18:26 UTC (permalink / raw)
  To: Bryan Henderson
  Cc: Trond Myklebust, Arjan van de Ven, Benny Halevy, Jan Harkes,
	Jeff Layton, linux-fsdevel, Miklos Szeredi, Mikulas Patocka,
	nfsv4

Bryan Henderson wrote:
>>>> "Clients MUST use filehandle comparisons only to improve
>>>> performance, not for correct behavior. All clients need to
>>>> be prepared for situations in which it cannot be determined
>>>> whether two filehandles denote the same object and in such
>>>> cases, avoid making invalid assumptions which might cause incorrect 
>>>>         
> behavior."
>   
>>> Don't you consider data corruption due to cache inconsistency an 
>>>       
> incorrect behavior?
>   
>> Exactly where do you see us violating the close-to-open cache
>> consistency guarantees?
>>     
>
> Let me add the information that Trond is implying:  His answer is yes, he 
> doesn't consider data corruption due to cache inconsistency to be 
> incorrect behavior.  And the reason is that, contrary to what one would 
> expect, NFS allows that (for reasons of implementation practicality).  It 
> says when you open a file via an NFS client and read it via that open 
> instance, you can legally see data as old as the moment you opened it. 
> Ergo, you can't use NFS in cases where that would cause unacceptable data 
> corruption.
>
> We normally think of this happening when a different client updates the 
> file, in which case there's no practical way for the reading client to 
> know his cache is stale.  When the updater and reader use the same client, 
> we can do better, but if I'm not mistaken, the NFS protocol does not 
> require us to do so.  And probably more relevant: the user wouldn't expect 
> cache consistency.

This last is especially true; the expectations for use of NFS-mounted
file systems are pretty well known and have been set by years of
experience.

A workaround is provided for cooperating processes which need stronger
consistency than the normal guarantees and that is file/record locking.
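
A minimal example of that workaround (Python here purely for
illustration; the same pattern applies to fcntl(2) in C): POSIX
advisory record locks, which NFS clients implement over the protocol's
locking machinery and which conventionally also trigger cache
flush/revalidation around the lock and unlock.

```python
# Cooperating processes serialize access to a shared file with an
# advisory record lock. POSIX-only (uses the fcntl module).
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()

fcntl.lockf(fd, fcntl.LOCK_EX)       # exclusive whole-file lock; peers block
os.write(fd, b"consistent update")   # region protected by the lock
fcntl.lockf(fd, fcntl.LOCK_UN)       # release; peers then see the update
os.close(fd)
```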

    Thanx...

       ps


* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04 10:47                                 ` Trond Myklebust
@ 2007-01-04 18:12                                   ` Bryan Henderson
  2007-01-04 18:26                                     ` Peter Staubach
  2007-01-05  8:28                                   ` Benny Halevy
  1 sibling, 1 reply; 20+ messages in thread
From: Bryan Henderson @ 2007-01-04 18:12 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Arjan van de Ven, Benny Halevy, Jan Harkes, Jeff Layton,
	linux-fsdevel, Miklos Szeredi, Mikulas Patocka, nfsv4

>>> "Clients MUST use filehandle comparisons only to improve
>>> performance, not for correct behavior. All clients need to
>>> be prepared for situations in which it cannot be determined
>>> whether two filehandles denote the same object and in such
>>> cases, avoid making invalid assumptions which might cause incorrect 
behavior."
>> Don't you consider data corruption due to cache inconsistency an 
incorrect behavior?
>
>Exactly where do you see us violating the close-to-open cache
>consistency guarantees?

Let me add the information that Trond is implying:  His answer is yes, he 
doesn't consider data corruption due to cache inconsistency to be 
incorrect behavior.  And the reason is that, contrary to what one would 
expect, NFS allows that (for reasons of implementation practicality).  It 
says when you open a file via an NFS client and read it via that open 
instance, you can legally see data as old as the moment you opened it. 
Ergo, you can't use NFS in cases where that would cause unacceptable data 
corruption.

We normally think of this happening when a different client updates the 
file, in which case there's no practical way for the reading client to 
know his cache is stale.  When the updater and reader use the same client, 
we can do better, but if I'm not mistaken, the NFS protocol does not 
require us to do so.  And probably more relevant: the user wouldn't expect 
cache consistency.
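
A schematic model of the guarantee described above (toy Python,
invented names, not the real client): the cache is revalidated against
the server only at open(), dirty data is flushed at close(), and in
between a reader may legally see data as old as its open.

```python
# Toy close-to-open consistency model: GETATTR-style revalidation at
# open(), write-back at close(), no revalidation on read().

class ToyFile:                                   # stands in for the server
    def __init__(self, data):
        self.data, self.ctime = data, 0
    def getattr(self):
        return {"ctime": self.ctime}
    def read(self):
        return self.data
    def write(self, data):
        self.data, self.ctime = data, self.ctime + 1

class CtoClient:
    def __init__(self, server):
        self.server, self.cache, self.ctime = server, None, None
    def open(self):
        ctime = self.server.getattr()["ctime"]   # one revalidation, at open
        if ctime != self.ctime:
            self.cache, self.ctime = self.server.read(), ctime
        return self
    def read(self):
        return self.cache                        # served from cache until reopen
    def close(self, dirty=None):
        if dirty is not None:
            self.server.write(dirty)             # flush dirty data on close

srv = ToyFile(b"v1")
reader = CtoClient(srv).open()
writer = CtoClient(srv).open()
writer.close(b"v2")           # another open instance updates the file
stale = reader.read()         # still the data from reader's open: legal
fresh = reader.open().read()  # reopening revalidates and sees the update
```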

--
Bryan Henderson                     IBM Almaden Research Center
San Jose CA                         Filesystems



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04 10:04                               ` Benny Halevy
@ 2007-01-04 10:47                                 ` Trond Myklebust
  2007-01-04 18:12                                   ` Bryan Henderson
  2007-01-05  8:28                                   ` Benny Halevy
  2007-01-05 16:40                                 ` Nicolas Williams
  1 sibling, 2 replies; 20+ messages in thread
From: Trond Myklebust @ 2007-01-04 10:47 UTC (permalink / raw)
  To: Benny Halevy
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote:
> I agree that the way the client implements its cache is out of the protocol
> scope. But how do you interpret "correct behavior" in section 4.2.1?
>  "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
> Don't you consider data corruption due to cache inconsistency an incorrect behavior?

Exactly where do you see us violating the close-to-open cache
consistency guarantees?

Trond



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-04  8:36                             ` Trond Myklebust
@ 2007-01-04 10:04                               ` Benny Halevy
  2007-01-04 10:47                                 ` Trond Myklebust
  2007-01-05 16:40                                 ` Nicolas Williams
  0 siblings, 2 replies; 20+ messages in thread
From: Benny Halevy @ 2007-01-04 10:04 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven


Trond Myklebust wrote:
> On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
>> I sincerely expect you or anybody else for that matter to try to provide
>> feedback and object to the protocol specification in case they disagree
>> with it (or think it's ambiguous or self contradicting) rather than ignoring
>> it and implementing something else. I think we're shooting ourselves in the
>> foot when doing so and it is in our common interest to strive to reach a
>> realistic standard we can all comply with and interoperate with each other.
> 
> You are reading the protocol wrong in this case.

Obviously we interpret it differently and that by itself calls for considering
clarification of the text :)

> 
> While the protocol does allow the server to implement the behaviour that
> you've been advocating, it in no way mandates it. Nor does it mandate
> that the client should gather files with the same (fsid,fileid) and
> cache them together. Those are issues to do with _implementation_, and
> are thus beyond the scope of the IETF.
> 
> In our case, the client will ignore the unique_handles attribute. It
> will use filehandles as our inode cache identifier. It will not jump
> through hoops to provide caching semantics that go beyond close-to-open
> for servers that set unique_handles to "false".

I agree that the way the client implements its cache is out of the protocol
scope. But how do you interpret "correct behavior" in section 4.2.1?
 "Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions which might cause incorrect behavior."
Don't you consider data corruption due to cache inconsistency an incorrect behavior?

Benny


* Re: [nfsv4] RE: Finding hardlinks
  2007-01-03 12:35                           ` Benny Halevy
  2007-01-04  0:43                             ` Trond Myklebust
@ 2007-01-04  8:36                             ` Trond Myklebust
  2007-01-04 10:04                               ` Benny Halevy
  1 sibling, 1 reply; 20+ messages in thread
From: Trond Myklebust @ 2007-01-04  8:36 UTC (permalink / raw)
  To: Benny Halevy
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
> I sincerely expect you or anybody else for that matter to try to provide
> feedback and object to the protocol specification in case they disagree
> with it (or think it's ambiguous or self contradicting) rather than ignoring
> it and implementing something else. I think we're shooting ourselves in the
> foot when doing so and it is in our common interest to strive to reach a
> realistic standard we can all comply with and interoperate with each other.

You are reading the protocol wrong in this case.

While the protocol does allow the server to implement the behaviour that
you've been advocating, it in no way mandates it. Nor does it mandate
that the client should gather files with the same (fsid,fileid) and
cache them together. Those are issues to do with _implementation_, and
are thus beyond the scope of the IETF.

In our case, the client will ignore the unique_handles attribute. It
will use filehandles as our inode cache identifier. It will not jump
through hoops to provide caching semantics that go beyond close-to-open
for servers that set unique_handles to "false".

Trond



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-03 12:35                           ` Benny Halevy
@ 2007-01-04  0:43                             ` Trond Myklebust
  2007-01-04  8:36                             ` Trond Myklebust
  1 sibling, 0 replies; 20+ messages in thread
From: Trond Myklebust @ 2007-01-04  0:43 UTC (permalink / raw)
  To: Benny Halevy
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
> Believe it or not, but server companies like Panasas try to follow the standard
> when designing and implementing their products while relying on client vendors
> to do the same.

I personally have never given a rat's arse about "standards" if they make
no sense to me. If the server is capable of knowing about hard links,
then why does it need all this extra crap in the filehandle that just
obfuscates the hard link info?

The bottom line is that nothing in our implementation will result in
such a server performing sub-optimally w.r.t. the client. The only
result is that we will conform to close-to-open semantics instead of
strict POSIX caching semantics when two processes have opened the same
file via different hard links.

> I sincerely expect you or anybody else for that matter to try to provide
> feedback and object to the protocol specification in case they disagree
> with it (or think it's ambiguous or self contradicting) rather than ignoring
> it and implementing something else. I think we're shooting ourselves in the
> foot when doing so and it is in our common interest to strive to reach a
> realistic standard we can all comply with and interoperate with each other.

This has nothing to do with the protocol itself: it has only to do with
caching semantics. As far as caching goes, the only guarantees that NFS
clients give are the close-to-open semantics, and this should indeed be
respected by the implementation in question.

Trond



* Re: [nfsv4] RE: Finding hardlinks
  2007-01-02 23:21                         ` Trond Myklebust
@ 2007-01-03 12:35                           ` Benny Halevy
  2007-01-04  0:43                             ` Trond Myklebust
  2007-01-04  8:36                             ` Trond Myklebust
  0 siblings, 2 replies; 20+ messages in thread
From: Benny Halevy @ 2007-01-03 12:35 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

Trond Myklebust wrote:
> On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
>> Trond Myklebust wrote:
>>>  
>>> On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
>>>> Mikulas Patocka wrote:
>>>>> BTW. how does (or how should?) NFS client deal with cache coherency if 
>>>>> filehandles for the same file differ?
>>>>>
>>>> Trond can probably answer this better than me...
>>>> As I read it, currently the nfs client matches both the fileid and the
>>>> filehandle (in nfs_find_actor). This means that different filehandles
>>>> for the same file would result in different inodes :(.
>>>> Strictly following the nfs protocol, comparing only the fileid should
>>>> be enough IF fileids are indeed unique within the filesystem.
>>>> Comparing the filehandle works as a workaround when the exported filesystem
>>>> (or the nfs server) violates that.  From a user stand point I think that
>>>> this should be configurable, probably per mount point.
>>> Matching files by fileid instead of filehandle is a lot more trouble
>>> since fileids may be reused after a file has been deleted. Every time
>>> you look up a file, and get a new filehandle for the same fileid, you
>>> would at the very least have to do another GETATTR using one of the
>>> 'old' filehandles in order to ensure that the file is the same object as
>>> the one you have cached. Then there is the issue of what to do when you
>>> open(), read() or write() to the file: which filehandle do you use, are
>>> the access permissions the same for all filehandles, ...
>>>
>>> All in all, much pain for little or no gain.
>> See my answer to your previous reply.  It seems like the current
>> implementation is in violation of the nfs protocol and the extra pain
>> is required.
> 
> ...and we should care because...?
> 
> Trond
> 

Believe it or not, but server companies like Panasas try to follow the standard
when designing and implementing their products while relying on client vendors
to do the same.

I sincerely expect you or anybody else for that matter to try to provide
feedback and object to the protocol specification in case they disagree
with it (or think it's ambiguous or self contradicting) rather than ignoring
it and implementing something else. I think we're shooting ourselves in the
foot when doing so and it is in our common interest to strive to reach a
realistic standard we can all comply with and interoperate with each other.

Benny



* RE: [nfsv4] RE: Finding hardlinks
  2006-12-31 21:25                       ` Halevy, Benny
@ 2007-01-02 23:21                         ` Trond Myklebust
  2007-01-03 12:35                           ` Benny Halevy
  0 siblings, 1 reply; 20+ messages in thread
From: Trond Myklebust @ 2007-01-02 23:21 UTC (permalink / raw)
  To: Halevy, Benny
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
> Trond Myklebust wrote:
> >  
> > On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
> > > Mikulas Patocka wrote:
> > 
> > > >BTW. how does (or how should?) NFS client deal with cache coherency if 
> > > >filehandles for the same file differ?
> > > >
> > > 
> > > Trond can probably answer this better than me...
> > > As I read it, currently the nfs client matches both the fileid and the
> > > filehandle (in nfs_find_actor). This means that different filehandles
> > > for the same file would result in different inodes :(.
> > > Strictly following the nfs protocol, comparing only the fileid should
> > > be enough IF fileids are indeed unique within the filesystem.
> > > Comparing the filehandle works as a workaround when the exported filesystem
> > > (or the nfs server) violates that.  From a user stand point I think that
> > > this should be configurable, probably per mount point.
> > 
> > Matching files by fileid instead of filehandle is a lot more trouble
> > since fileids may be reused after a file has been deleted. Every time
> > you look up a file, and get a new filehandle for the same fileid, you
> > would at the very least have to do another GETATTR using one of the
> > 'old' filehandles in order to ensure that the file is the same object as
> > the one you have cached. Then there is the issue of what to do when you
> > open(), read() or write() to the file: which filehandle do you use, are
> > the access permissions the same for all filehandles, ...
> > 
> > All in all, much pain for little or no gain.
> 
> See my answer to your previous reply.  It seems like the current
> implementation is in violation of the nfs protocol and the extra pain
> is required.

...and we should care because...?

Trond



* RE: [nfsv4] RE: Finding hardlinks
  2006-12-29 10:28                     ` [nfsv4] " Trond Myklebust
@ 2006-12-31 21:25                       ` Halevy, Benny
  2007-01-02 23:21                         ` Trond Myklebust
  0 siblings, 1 reply; 20+ messages in thread
From: Halevy, Benny @ 2006-12-31 21:25 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

Trond Myklebust wrote:
>  
> On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
> > Mikulas Patocka wrote:
> 
> > >BTW. how does (or how should?) NFS client deal with cache coherency if 
> > >filehandles for the same file differ?
> > >
> > 
> > Trond can probably answer this better than me...
> > As I read it, currently the nfs client matches both the fileid and the
> > filehandle (in nfs_find_actor). This means that different filehandles
> > for the same file would result in different inodes :(.
> > Strictly following the nfs protocol, comparing only the fileid should
> > be enough IF fileids are indeed unique within the filesystem.
> > Comparing the filehandle works as a workaround when the exported filesystem
> > (or the nfs server) violates that.  From a user stand point I think that
> > this should be configurable, probably per mount point.
> 
> Matching files by fileid instead of filehandle is a lot more trouble
> since fileids may be reused after a file has been deleted. Every time
> you look up a file, and get a new filehandle for the same fileid, you
> would at the very least have to do another GETATTR using one of the
> 'old' filehandles in order to ensure that the file is the same object as
> the one you have cached. Then there is the issue of what to do when you
> open(), read() or write() to the file: which filehandle do you use, are
> the access permissions the same for all filehandles, ...
> 
> All in all, much pain for little or no gain.

See my answer to your previous reply.  It seems like the current
implementation is in violation of the nfs protocol and the extra pain
is required.

> 
> Most servers therefore take great pains to ensure that clients can use
> filehandles to identify inodes. The exceptions tend to be broken in
> other ways

This may be true of Linux, but not necessarily of non-Linux-based NFS
servers.

> (Note: knfsd without the no_subtree_check option is one of
> these exceptions - it can break in the case of cross-directory renames).
> 
> Cheers,
>   Trond




* Re: [nfsv4] RE: Finding hardlinks
  2006-12-28 20:07                   ` Halevy, Benny
@ 2006-12-29 10:28                     ` Trond Myklebust
  2006-12-31 21:25                       ` Halevy, Benny
  0 siblings, 1 reply; 20+ messages in thread
From: Trond Myklebust @ 2006-12-29 10:28 UTC (permalink / raw)
  To: Halevy, Benny
  Cc: Mikulas Patocka, Jan Harkes, Miklos Szeredi, linux-kernel, nfsv4,
	linux-fsdevel, Jeff Layton, Arjan van de Ven

On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
> Mikulas Patocka wrote:

> >BTW. how does (or how should?) NFS client deal with cache coherency if 
> >filehandles for the same file differ?
> >
> 
> Trond can probably answer this better than me...
> As I read it, currently the nfs client matches both the fileid and the
> filehandle (in nfs_find_actor). This means that different filehandles
> for the same file would result in different inodes :(.
> Strictly following the nfs protocol, comparing only the fileid should
> be enough IF fileids are indeed unique within the filesystem.
> Comparing the filehandle works as a workaround when the exported filesystem
> (or the nfs server) violates that.  From a user standpoint I think that
> this should be configurable, probably per mount point.

Matching files by fileid instead of filehandle is a lot more trouble
since fileids may be reused after a file has been deleted. Every time
you look up a file, and get a new filehandle for the same fileid, you
would at the very least have to do another GETATTR using one of the
'old' filehandles in order to ensure that the file is the same object as
the one you have cached. Then there is the issue of what to do when you
open(), read() or write() to the file: which filehandle do you use, are
the access permissions the same for all filehandles, ...

All in all, much pain for little or no gain.
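To make the trade-off above concrete, here is an illustrative sketch (plain user-space C, not kernel code) of the two matching strategies. The structure and function names are simplified stand-ins, not the actual Linux implementation; the `getattr` callback stands in for the extra GETATTR round trip that fileid-only matching would require whenever a new filehandle shows up for a cached fileid.

```c
#include <stdbool.h>

struct nfs_fattr { unsigned long fileid; };   /* result of a GETATTR */
struct cached_inode { unsigned long fileid; unsigned long fh; };

/* Strict matching (cf. nfs_find_actor): a cache hit requires BOTH the
 * fileid and the filehandle to match, so a second filehandle for the
 * same file yields a second inode. */
static bool match_strict(const struct cached_inode *c,
                         unsigned long fileid, unsigned long fh)
{
    return c->fileid == fileid && c->fh == fh;
}

/* Fileid-only matching as discussed above: on a filehandle mismatch the
 * client must GETATTR through the *old* handle to prove the cached file
 * still exists -- otherwise the fileid may simply have been reused for
 * a brand-new file after a delete. */
static bool match_by_fileid(const struct cached_inode *c,
                            unsigned long fileid, unsigned long fh,
                            bool (*getattr)(unsigned long old_fh,
                                            struct nfs_fattr *out))
{
    struct nfs_fattr attr;

    if (c->fileid != fileid)
        return false;
    if (c->fh == fh)
        return true;                  /* same handle: same object */
    if (!getattr(c->fh, &attr))
        return false;                 /* old handle stale: fileid reused */
    return attr.fileid == fileid;     /* alias handle for a live file */
}
```

Note that the extra round trip happens on every lookup that returns an unfamiliar filehandle, which is exactly the "much pain" being weighed here.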

Most servers therefore take great pains to ensure that clients can use
filehandles to identify inodes. The exceptions tend to be broken in
other ways (Note: knfsd without the no_subtree_check option is one of
these exceptions - it can break in the case of cross-directory renames).
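For reference, the knfsd caveat above maps to an option in /etc/exports; a hypothetical export line (the path and hostname are made up for illustration) might look like:

```
# /etc/exports -- illustrative example
# With subtree_check, the filehandle encodes information about the
# file's location in the exported subtree, so a cross-directory rename
# can invalidate a client's handle for the file.  no_subtree_check
# avoids that at the cost of a weaker export-boundary check.
/srv/export   client.example.com(rw,no_subtree_check)
```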

Cheers,
  Trond




Thread overview: 20+ messages
2007-01-15 12:53 RE: Finding hardlinks Noveck, Dave
2007-01-16  6:06 ` [nfsv4] " Spencer Shepler
2007-01-16  6:16   ` [NFS] " Benny Halevy
  -- strict thread matches above, loose matches on Subject: below --
2007-01-05 17:24 [nfsv4] " Noveck, Dave
2006-12-20  9:03 Mikulas Patocka
2006-12-20 11:44 ` Miklos Szeredi
2006-12-21 18:58   ` Jan Harkes
2006-12-21 23:49     ` Mikulas Patocka
2006-12-23 10:18       ` Arjan van de Ven
2006-12-23 14:00         ` Mikulas Patocka
2006-12-28  9:06           ` Benny Halevy
2006-12-28 13:22             ` Jeff Layton
2006-12-28 15:12               ` Benny Halevy
2006-12-28 18:17                 ` Mikulas Patocka
2006-12-28 20:07                   ` Halevy, Benny
2006-12-29 10:28                     ` [nfsv4] " Trond Myklebust
2006-12-31 21:25                       ` Halevy, Benny
2007-01-02 23:21                         ` Trond Myklebust
2007-01-03 12:35                           ` Benny Halevy
2007-01-04  0:43                             ` Trond Myklebust
2007-01-04  8:36                             ` Trond Myklebust
2007-01-04 10:04                               ` Benny Halevy
2007-01-04 10:47                                 ` Trond Myklebust
2007-01-04 18:12                                   ` Bryan Henderson
2007-01-04 18:26                                     ` Peter Staubach
2007-01-05  8:28                                   ` Benny Halevy
2007-01-05 10:29                                     ` Trond Myklebust
2007-01-05 16:40                                 ` Nicolas Williams
2007-01-05 16:56                                   ` Trond Myklebust
2007-01-06  7:44                                   ` Halevy, Benny
2007-01-10 13:04                                   ` Benny Halevy
