* "(deleted)" directories
@ 2018-11-02  5:26 Malahal Naineni
  2018-11-02 11:05 ` Benjamin Coddington
  0 siblings, 1 reply; 24+ messages in thread
From: Malahal Naineni @ 2018-11-02  5:26 UTC
To: linux-nfs

Hi All,

We are using NFS-Ganesha with Linux NFS clients. The client's shell
reports the following error:

  shell-init: error retrieving current directory: getcwd: cannot access
  parent directories: No such file or directory

Based on lsof, the directory is marked deleted:

  sh  112544  malahal  cwd  DIR  0,67  65536  45605209  /home/malahal (deleted) (10.120.154.42:/nfs/malahal-export/)

A cd to ROOT followed by a cd back to the same home directory fixes the
issue. The client behaves as though the directory was deleted and
recreated! Our NFS-Ganesha server implementation uses multiple file
handles that point to the same object. The NFS spec says this should be
fine, but Linux NFS seems to be broken in this regard. tcpdump does
show a file handle change around the time of the issue (note that all
file handles are permanent, meaning they are valid at the server at any
time).

Function nfs_prime_dcache() seems to invalidate the dcache entry if
nfs_same_file() returns false. If I read it correctly, with the
following change nfs_same_file() returns false whenever the file handle
changes. Can this be the source of my issue? It seems that the client
should do this only if the file handle is NOT valid (e.g. if it gets
ESTALE), right?

The following commit seems to assume that the objects are different if
they have different file handles!

commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955
Author: Trond Myklebust <trond.myklebust@primarydata.com>
Date:   Thu Sep 22 13:38:52 2016 -0400

    NFS: Fix inode corruption in nfs_prime_dcache()
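The failure mode suspected above can be illustrated with a toy model. This is only a sketch under stated assumptions: it is not the Linux NFS client code, and `CachedEntry`, `refresh_by_handle` and `refresh_by_fileid` are hypothetical names invented for the illustration. It contrasts a cache that treats a changed file handle as a different object with one that compares the fileid instead. The second policy is shown purely for contrast and is not claimed to be safe in general.

```python
# Toy model only: NOT the Linux NFS client implementation.
# All class and function names here are hypothetical.
from dataclasses import dataclass


@dataclass
class CachedEntry:
    name: str
    filehandle: bytes
    fileid: int
    unhashed: bool = False  # stands in for what lsof reports as "(deleted)"


def refresh_by_handle(entry: CachedEntry, new_fh: bytes, new_fileid: int) -> CachedEntry:
    """Policy 1: a different file handle is taken to mean a different object."""
    if entry.filehandle == new_fh:
        return entry
    entry.unhashed = True  # the old entry is dropped; its holders see it as deleted
    return CachedEntry(entry.name, new_fh, new_fileid)


def refresh_by_fileid(entry: CachedEntry, new_fh: bytes, new_fileid: int) -> CachedEntry:
    """Policy 2: keep the entry when the server reports the same fileid."""
    if entry.fileid == new_fileid:
        entry.filehandle = new_fh  # same object, remember the newer handle
        return entry
    entry.unhashed = True
    return CachedEntry(entry.name, new_fh, new_fileid)


if __name__ == "__main__":
    cwd = CachedEntry("malahal", b"handle-A", 45605209)
    replacement = refresh_by_handle(cwd, b"handle-B", 45605209)
    print("old entry unhashed:", cwd.unhashed)
    print("replacement is a new entry:", replacement is not cwd)
```

Under policy 1, a shell whose working directory is the old entry keeps pointing at an unhashed object, which would be consistent with the getcwd error in the report.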
* Re: "(deleted)" directories
  2018-11-02  5:26 "(deleted)" directories Malahal Naineni
@ 2018-11-02 11:05 ` Benjamin Coddington
  2018-11-02 15:54   ` Malahal Naineni
  0 siblings, 1 reply; 24+ messages in thread
From: Benjamin Coddington @ 2018-11-02 11:05 UTC
To: Malahal Naineni; +Cc: linux-nfs

On 2 Nov 2018, at 1:26, Malahal Naineni wrote:

> [...]
> Function nfs_prime_dcache() seems to invalidate the dcache entry if
> nfs_same_file() returns false. nfs_same_file() does seem to return
> false with the following change, if I read it correctly, if there is a
> file handle change. Can this be the source of my issue? It seems that
> the client should do this only if the file handle is NOT valid (e.g.
> if it gets ESTALE), right?
>
> The following commit seems to assume that the objects are different if
> they have different file handles!
>
> commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955
> Author: Trond Myklebust <trond.myklebust@primarydata.com>
> Date:   Thu Sep 22 13:38:52 2016 -0400
>
>     NFS: Fix inode corruption in nfs_prime_dcache()

My understanding is that for NFSv3 we have to assume that distinct
filehandles are distinct objects, but maybe I'm wrong about this.

For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 section
10.3.4 to determine whether the differing filehandles refer to the same
object; specifically, the recommended fileid attribute needs to be
implemented. Is Ganesha returning the same fileid for both filehandles?

Ben
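One reading of the RFC 7530 section 10.3.4 guidance mentioned above can be sketched as a three-valued comparison. This is a minimal sketch, not client code: `Attrs`, `Identity` and `same_object` are hypothetical names, and the attribute values are assumed to have already been fetched with GETATTR for each filehandle.

```python
# Sketch of a file-identity check in the spirit of RFC 7530 section 10.3.4.
# Attrs, Identity and same_object are hypothetical names for illustration.
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Identity(Enum):
    SAME = "same object"
    DISTINCT = "distinct objects"
    UNKNOWN = "cannot be determined"


@dataclass(frozen=True)
class Attrs:
    fsid: Tuple[int, int]                  # (major, minor)
    fileid: Optional[int]                  # None if the server did not return it
    unique_handles: Optional[bool] = None  # None if not returned


def same_object(fh_a: bytes, a: Attrs, fh_b: bytes, b: Attrs) -> Identity:
    if fh_a == fh_b:
        return Identity.SAME       # equal handles always name the same object
    if a.fsid != b.fsid:
        return Identity.DISTINCT   # different file systems
    if a.unique_handles or b.unique_handles:
        return Identity.DISTINCT   # server promises one handle per object
    if a.fileid is None or b.fileid is None:
        return Identity.UNKNOWN    # no basis for a decision
    if a.fileid != b.fileid:
        return Identity.DISTINCT
    return Identity.SAME


if __name__ == "__main__":
    attrs = Attrs(fsid=(1, 0), fileid=45605209)
    print(same_object(b"handle-A", attrs, b"handle-B", attrs).value)
```

The UNKNOWN result matters: when the fileid is missing, the client cannot tell either way, so anything that depends on identity (such as sharing cached data) has to be conservative.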
* Re: "(deleted)" directories
  2018-11-02 11:05 ` Benjamin Coddington
@ 2018-11-02 15:54   ` Malahal Naineni
  2018-11-02 20:26     ` Trond Myklebust
  0 siblings, 1 reply; 24+ messages in thread
From: Malahal Naineni @ 2018-11-02 15:54 UTC
To: bcodding; +Cc: linux-nfs

Ben, NFSv3 RFC1813.txt states: "If two file handles from the same
server are equal, they must refer to the same file, but if they are
not equal, no conclusions can be drawn." Ganesha does return the same
fileid (inode number) here.

NFSv4 introduced the "unique_handles" attribute, but I don't see the
Linux NFS client using it at all.

Regards, Malahal.

On Fri, Nov 2, 2018 at 4:35 PM Benjamin Coddington <bcodding@redhat.com> wrote:
> [...]
> My understanding is that for NFSv3 we have to assume that distinct
> filehandles are distinct objects, but maybe I'm wrong about this.
>
> For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 section 10.3.4
> to determine if the differing filehandles are the same object, specifically
> the fileid recommended attribute needs to be implemented. Is Ganesha
> returning the same fileid for both filehandles?
>
> Ben
* Re: "(deleted)" directories
  2018-11-02 15:54   ` Malahal Naineni
@ 2018-11-02 20:26     ` Trond Myklebust
  2018-11-02 22:07       ` Matt Benjamin
  2018-11-03  5:00       ` Malahal Naineni
  0 siblings, 2 replies; 24+ messages in thread
From: Trond Myklebust @ 2018-11-02 20:26 UTC
To: bcodding, malahal; +Cc: linux-nfs

On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni wrote:
> Ben, NFSv3 RFC1813.txt states: "If two file handles from the same
> server are equal, they must refer to the same file, but if they are
> not equal, no conclusions can be drawn." Ganesha does return same
> fileid here (inode).
>
> In NFSv4, they have introduced "unique_handles" attribute. I don't see
> Linux NFS client using this at all though.

Why does your server need to have multiple filehandles refer to the
same file, and why do you expect clients to support this?

Yes, the spec allows it, but that's not a sufficient reason.

> [...]

--
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space
* Re: "(deleted)" directories
  2018-11-02 20:26     ` Trond Myklebust
@ 2018-11-02 22:07       ` Matt Benjamin
  2018-11-03  0:15         ` Trond Myklebust
  2018-11-03  5:00       ` Malahal Naineni
  1 sibling, 1 reply; 24+ messages in thread
From: Matt Benjamin @ 2018-11-02 22:07 UTC
To: Trond Myklebust; +Cc: bcodding, malahal, linux-nfs

It sounds like a pretty good one; it goes to the heart of what a
specification is.

Matt

On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> [...]
> Why does your server need to have multiple filehandles refer to the
> same file, and why do you expect clients to support this?
>
> Yes, the spec allows it, but that's not a sufficient reason.
>
> [...]

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
* Re: "(deleted)" directories
  2018-11-02 22:07       ` Matt Benjamin
@ 2018-11-03  0:15         ` Trond Myklebust
  2018-11-03  2:38           ` Marc Eshel
  0 siblings, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2018-11-03  0:15 UTC
To: mbenjami; +Cc: bcodding, malahal, linux-nfs

On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote:
> It sounds like a pretty good one, that goes to the heart of what a
> specification is
>

While admittedly it is (still) Dia de los Muertos today, I would think
that someone who resurrected a part of the NFSv3 spec that has been
unused for the full 23 years of its existence might have some
explanation for why they did so?

IOW: not being of a particularly religious persuasion, I usually want
to understand why features are needed rather than having blind faith in
the person who wrote the spec.

> Matt
>
> [...]

--
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space
* Re: "(deleted)" directories
  2018-11-03  0:15         ` Trond Myklebust
@ 2018-11-03  2:38           ` Marc Eshel
  2018-11-03  3:27             ` Trond Myklebust
  2018-11-04  5:31             ` NeilBrown
  0 siblings, 2 replies; 24+ messages in thread
From: Marc Eshel @ 2018-11-03  2:38 UTC
To: Trond Myklebust; +Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami

One reason to have different FHs for the same file is that a file can
be linked from multiple directories. Adding the parent inode to the FH
helps find the name of the file by looking for the file inode in the
parent directory.

Marc.

linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 05:15:42 PM:

> From: Trond Myklebust <trondmy@hammerspace.com>
>
> On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote:
> > It sounds like a pretty good one, that goes to the heart of what a
> > specification is
>
> While admittedly it is (still) Dia de los Muertos today, I would think
> that someone who resurrected a part of the NFSv3 spec that has been
> unused for the full 23 years of its existence might have some
> explanation for why they did so?
>
> IOW: not being of a particularly religious persuasion, I usually want
> to understand why features are needed rather than having blind faith in
> the person who wrote the spec.
>
> [...]
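The scheme described above (parent inode carried in the file handle) can be sketched as follows. The handle layout here is a made-up assumption for illustration, not the format used by Ganesha or by any particular backend; `make_handle`, `parse_handle` and `name_of` are hypothetical helpers.

```python
# Hypothetical handle layout for illustration: (fsid, parent inode, inode),
# each a big-endian 64-bit integer. Not any real server's format.
import struct
from typing import Dict, Optional, Tuple

_LAYOUT = ">QQQ"


def make_handle(fsid: int, parent_ino: int, ino: int) -> bytes:
    return struct.pack(_LAYOUT, fsid, parent_ino, ino)


def parse_handle(fh: bytes) -> Tuple[int, int, int]:
    return struct.unpack(_LAYOUT, fh)


def name_of(fh: bytes, directories: Dict[int, Dict[str, int]]) -> Optional[str]:
    """Find the file's name by scanning only the parent named in the handle."""
    _fsid, parent_ino, ino = parse_handle(fh)
    for name, child_ino in directories.get(parent_ino, {}).items():
        if child_ino == ino:
            return name
    return None


if __name__ == "__main__":
    # Inode 42 is hard-linked from directory 10 as "a.txt" and from 20 as "b.txt".
    directories = {10: {"a.txt": 42}, 20: {"b.txt": 42}}
    fh_via_10 = make_handle(1, 10, 42)
    fh_via_20 = make_handle(1, 20, 42)
    print(fh_via_10 != fh_via_20)           # two handles, one file
    print(name_of(fh_via_10, directories))  # name recovered from the parent
    print(name_of(fh_via_20, directories))
```

The name lookup is cheap because only one directory is scanned, and the cost is that a hard-linked file ends up with one handle per link while its inode (and hence its fileid) stays the same.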
* Re: "(deleted)" directories
  2018-11-03  2:38           ` Marc Eshel
@ 2018-11-03  3:27             ` Trond Myklebust
  2018-11-03  3:37               ` Marc Eshel
  2018-11-04  5:31             ` NeilBrown
  1 sibling, 1 reply; 24+ messages in thread
From: Trond Myklebust @ 2018-11-03  3:27 UTC
To: eshel; +Cc: bcodding, mbenjami, linux-nfs-owner, linux-nfs, malahal

On Fri, 2018-11-02 at 18:38 -0800, Marc Eshel wrote:
> One reason to have different FHs for the same file is that a file can
> be linked from multiple directories.
> Adding the parent inode to the FH help finding the name of the file by
> looking for the file inode in the parent directory.
>

No... We've been there and done that. Encoding parent directories in
the filehandle breaks rename and unlink, and makes it a pain to recover
state.
There is no way in hell we're going to commit to supporting that model.

> Marc.
>
> [...]

--
Trond Myklebust
CTO, Hammerspace Inc
4300 El Camino Real, Suite 105
Los Altos, CA 94022
www.hammer.space
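One way the rename problem mentioned above can show up is sketched below. This is an assumption-laden toy, since the message does not spell out the mechanism: handles are modelled as a hypothetical (parent inode, inode) pair, and `lookup`, `resolve` and `rename` are invented helpers. The point is only that a handle obtained before a cross-directory rename no longer resolves through the parent it encodes.

```python
# Toy model with a hypothetical handle of the form (parent_ino, ino).
# It illustrates one possible failure, not any real server's behaviour.
from typing import Dict, Optional, Tuple

Handle = Tuple[int, int]
Tree = Dict[int, Dict[str, int]]  # directory inode -> {name: child inode}


def lookup(tree: Tree, parent_ino: int, name: str) -> Handle:
    """Hand out a handle that embeds the directory used for the lookup."""
    return (parent_ino, tree[parent_ino][name])


def resolve(tree: Tree, fh: Handle) -> Optional[str]:
    """Resolve a handle by searching the parent it encodes."""
    parent_ino, ino = fh
    for name, child_ino in tree.get(parent_ino, {}).items():
        if child_ino == ino:
            return name
    return None  # a server built this way might have to answer ESTALE here


def rename(tree: Tree, src_dir: int, src_name: str, dst_dir: int, dst_name: str) -> None:
    tree[dst_dir][dst_name] = tree[src_dir].pop(src_name)


if __name__ == "__main__":
    tree: Tree = {10: {"f": 42}, 20: {}}
    old_fh = lookup(tree, 10, "f")
    rename(tree, 10, "f", 20, "g")
    print(resolve(tree, old_fh))                  # the old handle no longer resolves
    print(lookup(tree, 20, "g"), "vs", old_fh)    # same inode, different handle
```

A client that still holds the old handle (for an open file, say) would then be holding something the server can no longer map back to the file, even though the file itself is unchanged.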
* Re: "(deleted)" directories
  2018-11-03  3:27             ` Trond Myklebust
@ 2018-11-03  3:37               ` Marc Eshel
  0 siblings, 0 replies; 24+ messages in thread
From: Marc Eshel @ 2018-11-03  3:37 UTC
To: Trond Myklebust; +Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami

You asked for a reason and I gave you one. The client shouldn't care
about the reason, since the FH is opaque to the client. It is the
server's decision what to encode in the FH. The client should just
follow the spec.

Marc.

linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 08:27:10 PM:

> From: Trond Myklebust <trondmy@hammerspace.com>
>
> On Fri, 2018-11-02 at 18:38 -0800, Marc Eshel wrote:
> > One reason to have different FHs for the same file is that a file can
> > be linked from multiple directories.
> > Adding the parent inode to the FH help finding the name of the file by
> > looking for the file inode in the parent directory.
> >
>
> No... We've been there and done that. Encoding parent directories in
> the filehandle breaks rename, unlink, and makes it a pain to recover
> state.
> There is no way in hell we're going to commit to support that model.
>
> > Marc.
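Trond's objection quoted above, that a parent directory encoded in the filehandle breaks rename, can be illustrated with a small Python sketch. The handle layout (`fsid`, `parent_ino`, `ino`) and the helpers `make_fh` and `fileid_of` are hypothetical and exist only for this illustration; they are not Ganesha's or any real server's encoding, and the inode numbers are arbitrary.

```python
import struct

def make_fh(fsid: int, parent_ino: int, ino: int) -> bytes:
    """Pack a hypothetical filehandle: fsid, parent inode, file inode."""
    return struct.pack(">IQQ", fsid, parent_ino, ino)

def fileid_of(fh: bytes) -> int:
    """Recover the file's own inode number (the last field of the handle)."""
    return struct.unpack(">IQQ", fh)[2]

# The same object (inode 45605209) before and after a rename that moves it
# from the directory with inode 100 to the directory with inode 200.
fh_before = make_fh(fsid=67, parent_ino=100, ino=45605209)
fh_after = make_fh(fsid=67, parent_ino=200, ino=45605209)

same_handle = fh_before == fh_after
same_fileid = fileid_of(fh_before) == fileid_of(fh_after)
print(same_handle, same_fileid)
```

Under this assumed layout the object keeps its fileid across the rename, but a client that compares handles byte for byte sees two different handles.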
* Re: "(deleted)" directories
  2018-11-03 2:38 ` Marc Eshel
  2018-11-03 3:27 ` Trond Myklebust
@ 2018-11-04 5:31 ` NeilBrown
  2018-11-04 19:47 ` Marc Eshel
  1 sibling, 1 reply; 24+ messages in thread
From: NeilBrown @ 2018-11-04 5:31 UTC
To: Marc Eshel, Trond Myklebust
Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami

On Fri, Nov 02 2018, Marc Eshel wrote:

> One reason to have different FHs for the same file is that a file can be
> linked from multiple directories.

This has some basis when considering filehandles for non-directories.
However, the original problem was with filehandles for directories.....

> Adding the parent inode to the FH helps find the name of the file by
> looking for the file inode in the parent directory.

....and directories have a ".." link, obviating the need to store parent
information in the filehandle.

NeilBrown

> Marc.
>
> linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 05:15:42 PM:
>
>> From: Trond Myklebust <trondmy@hammerspace.com>
>> To: "mbenjami@redhat.com" <mbenjami@redhat.com>
>> Cc: "bcodding@redhat.com" <bcodding@redhat.com>, "malahal@gmail.com"
>> <malahal@gmail.com>, "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>
>> Date: 11/02/2018 05:15 PM
>> Subject: Re: "(deleted)" directories
>> Sent by: linux-nfs-owner@vger.kernel.org
>>
>> On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote:
>> > It sounds like a pretty good one, that goes to the heart of what a
>> > specification is
>>
>> While admittedly it is (still) Dia de los Muertos today, I would think
>> that someone who resurrected a part of the NFSv3 spec that has been
>> unused for the full 23 years of its existence might have some
>> explanation for why they did so?
>>
>> IOW: not being of a particularly religious persuasion, I usually want
>> to understand why features are needed rather than having blind faith in
>> the person who wrote the spec.
* Re: "(deleted)" directories
  2018-11-04 5:31 ` NeilBrown
@ 2018-11-04 19:47 ` Marc Eshel
  2018-11-05 0:32 ` NeilBrown
  0 siblings, 1 reply; 24+ messages in thread
From: Marc Eshel @ 2018-11-04 19:47 UTC
To: NeilBrown
Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami, Trond Myklebust

linux-nfs-owner@vger.kernel.org wrote on 11/03/2018 10:31:29 PM:

> From: NeilBrown <neilb@suse.com>
> To: Marc Eshel <eshel@us.ibm.com>, Trond Myklebust <trondmy@hammerspace.com>
> Cc: "bcodding@redhat.com" <bcodding@redhat.com>,
> "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
> linux-nfs-owner@vger.kernel.org, "malahal@gmail.com" <malahal@gmail.com>,
> "mbenjami@redhat.com" <mbenjami@redhat.com>
> Date: 11/03/2018 10:41 PM
> Subject: Re: "(deleted)" directories
> Sent by: linux-nfs-owner@vger.kernel.org
>
> On Fri, Nov 02 2018, Marc Eshel wrote:
>
> > One reason to have different FHs for the same file is that a file can be
> > linked from multiple directories.
>
> This has some basis when considering filehandles for non-directories.
> However, the original problem was with filehandles for directories.....

This was just an example of why the FH might be different; I don't think we
depend on it for the parent information anymore. Malahal listed some other
reasons for having different FHs for the same file. I believe that Ganesha
splits the FH into the key portion (the unique id of the file) and some other
information that is file-system dependent. If the NFS client cannot
handle the spec definition of FH, maybe the spec should be updated to
something like what Ganesha does.
Marc.

> > Adding the parent inode to the FH helps find the name of the file by
> > looking for the file inode in the parent directory.
>
> ....and directories have a ".." link, obviating the need to store parent
> information in the filehandle.
>
> NeilBrown
>
> > Marc.
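The split described above, a key portion plus file-system-dependent data, might look roughly like the following Python sketch. The `SplitHandle` type, its wire layout (one length byte, then the key, then the private part) and the `parse` helper are assumptions made up for illustration; the thread does not describe Ganesha's actual encoding.

```python
from typing import NamedTuple

class SplitHandle(NamedTuple):
    """Hypothetical handle: a stable key plus file-system-dependent data."""
    key: bytes
    fs_private: bytes

    def wire(self) -> bytes:
        # One length byte for the key, then the key, then the private part.
        return bytes([len(self.key)]) + self.key + self.fs_private

def parse(wire: bytes) -> SplitHandle:
    """Split a wire-format handle back into its key and private parts."""
    key_len = wire[0]
    return SplitHandle(wire[1:1 + key_len], wire[1 + key_len:])

# Two handles for the same object that differ only in the private part.
a = SplitHandle(b"obj-45605209", b"v1").wire()
b = SplitHandle(b"obj-45605209", b"v2").wire()

handles_equal = a == b
keys_equal = parse(a).key == parse(b).key
print(handles_equal, keys_equal)
```

Only a party that knows the layout can compare keys this way. To a client the handle is opaque, so it can only compare the full byte strings, which is where the disagreement in this thread comes from.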
* Re: "(deleted)" directories
  2018-11-04 19:47 ` Marc Eshel
@ 2018-11-05 0:32 ` NeilBrown
  2018-11-05 4:41 ` Malahal Naineni
  2018-11-05 4:48 ` Marc Eshel
  0 siblings, 2 replies; 24+ messages in thread
From: NeilBrown @ 2018-11-05 0:32 UTC
To: Marc Eshel
Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami, Trond Myklebust

On Sun, Nov 04 2018, Marc Eshel wrote:

> linux-nfs-owner@vger.kernel.org wrote on 11/03/2018 10:31:29 PM:
>
>> From: NeilBrown <neilb@suse.com>
>> To: Marc Eshel <eshel@us.ibm.com>, Trond Myklebust <trondmy@hammerspace.com>
>> Cc: "bcodding@redhat.com" <bcodding@redhat.com>,
>> "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
>> linux-nfs-owner@vger.kernel.org, "malahal@gmail.com" <malahal@gmail.com>,
>> "mbenjami@redhat.com" <mbenjami@redhat.com>
>> Date: 11/03/2018 10:41 PM
>> Subject: Re: "(deleted)" directories
>> Sent by: linux-nfs-owner@vger.kernel.org
>>
>> On Fri, Nov 02 2018, Marc Eshel wrote:
>>
>> > One reason to have different FHs for the same file is that a file can be
>> > linked from multiple directories.
>>
>> This has some basis when considering filehandles for non-directories.
>> However, the original problem was with filehandles for directories.....
>
> This was just an example of why FH might be different, I don't think we
> depend on it for the parent information anymore. Malahal listed some other
> reasons for having different FH for the same file. I believe that Ganesha
> split the FH to the key portion (the unique id of the file) and some other
> information that is file system dependent. If the NFS client can not
> handle the spec definition of FH maybe the spec should be updated to
> something like Ganesha does.
> Marc.

Do we know exactly why the FH changed in this particular circumstance?
Is there some way to find out?

The NFSv3 spec has been updated - it is called "NFSv4" (now 4.2). It
says a lot more things about filehandles, but even there, the spec is
only as good as what has been implemented and tested. I'm pretty
sure that there are parts of the FH spec that have never been put into
practice - so using them would not be wise (I'm particularly thinking of
volatile file handles).

For better or worse, Linux requires directories to have stable
filehandles for NFSv3. This requirement is effectively imposed by the
dcache. If there were some way to reliably check if two filehandles
referred to the same directory, then we could relax that restriction,
but I don't think there is.

I think the other possible reason mentioned for changing the filehandle
is to support migration. NFSv3 definitely doesn't support migration.
NFSv4 explicitly tries to.

NeilBrown

>> > Adding the parent inode to the FH helps find the name of the file by
>> > looking for the file inode in the parent directory.
>>
>> ....and directories have a ".." link, obviating the need to store parent
>> information in the filehandle.
>>
>> NeilBrown
>>
>> > Marc.
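The stable-filehandle requirement described above can be modelled with a toy cache. The Python sketch below is not the kernel's code: `ToyDcache`, `Dentry`, `prime` and the `unhashed` flag are hypothetical names, and the logic only mirrors the behaviour reported earlier in this thread, where a refreshed directory entry with a different filehandle is treated as a new object and the old entry is dropped.

```python
from dataclasses import dataclass

@dataclass
class Dentry:
    name: str
    fh: bytes
    fileid: int
    unhashed: bool = False  # True once the cache has dropped this entry

class ToyDcache:
    def __init__(self) -> None:
        self.entries = {}  # maps a name to its current Dentry

    def prime(self, name: str, fh: bytes, fileid: int) -> Dentry:
        """Refresh a cached name from a directory listing."""
        old = self.entries.get(name)
        if old is not None and old.fh == fh:
            return old  # equal handles: definitely the same object
        if old is not None:
            old.unhashed = True  # differing handles: assume a new object
        new = Dentry(name, fh, fileid)
        self.entries[name] = new
        return new

cache = ToyDcache()
cwd = cache.prime("malahal", b"\x01handle-A", 45605209)
cache.prime("malahal", b"\x02handle-B", 45605209)  # same fileid, new handle
print(cwd.unhashed)
```

A process still holding `cwd` now refers to an entry the cache no longer lists, which is roughly the "(deleted)" state seen in the lsof output. Comparing `fileid` in addition to the handle would keep the entry in this toy model, but as noted above there may be no reliable way to do that for NFSv3 directories.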
* Re: "(deleted)" directories
  2018-11-05 0:32 ` NeilBrown
@ 2018-11-05 4:41 ` Malahal Naineni
  2018-11-05 5:09 ` NeilBrown
  2018-11-05 4:48 ` Marc Eshel
  1 sibling, 1 reply; 24+ messages in thread
From: Malahal Naineni @ 2018-11-05 4:41 UTC
To: neilb
Cc: eshel, bcodding, linux-nfs, linux-nfs-owner, mbenjami, trondmy

> Do we know exactly why the FH changed in this particular circumstance?

In this instance, this is due to a code bug, but obviously there are
legitimate cases where this occurs with Ganesha.

> (I'm particularly thinking of volatile file handles).

The NFSv4 RFC has a "unique filehandles" concept as well. The Linux NFS
client doesn't seem to use the "unique filehandles" attribute either.

On Mon, Nov 5, 2018 at 6:02 AM NeilBrown <neilb@suse.com> wrote:
>
> On Sun, Nov 04 2018, Marc Eshel wrote:
>
> > linux-nfs-owner@vger.kernel.org wrote on 11/03/2018 10:31:29 PM:
> >
> >> From: NeilBrown <neilb@suse.com>
> >> To: Marc Eshel <eshel@us.ibm.com>, Trond Myklebust <trondmy@hammerspace.com>
> >> Cc: "bcodding@redhat.com" <bcodding@redhat.com>,
> >> "linux-nfs@vger.kernel.org" <linux-nfs@vger.kernel.org>,
> >> linux-nfs-owner@vger.kernel.org, "malahal@gmail.com" <malahal@gmail.com>,
> >> "mbenjami@redhat.com" <mbenjami@redhat.com>
> >> Date: 11/03/2018 10:41 PM
> >> Subject: Re: "(deleted)" directories
> >> Sent by: linux-nfs-owner@vger.kernel.org
> >>
> >> On Fri, Nov 02 2018, Marc Eshel wrote:
> >>
> >> > One reason to have different FHs for the same file is that a file can be
> >> > linked from multiple directories.
> >>
> >> This has some basis when considering filehandles for non-directories.
> >> However, the original problem was with filehandles for directories.....
> >
> > This was just an example of why FH might be different, I don't think we
> > depend on it for the parent information anymore. Malahal listed some other
> > reasons for having different FH for the same file. I believe that Ganesha
> > split the FH to the key portion (the unique id of the file) and some other
> > information that is file system dependent. If the NFS client can not
> > handle the spec definition of FH maybe the spec should be updated to
> > something like Ganesha does.
> > Marc.
>
> Do we know exactly why the FH changed in this particular circumstance?
> Is there some way to find out?
>
> The NFSv3 spec has been updated - it is called "NFSv4" (now 4.2). It
> says a lot more things about filehandles, but even there, the spec is
> only as good as what has been implemented and tested. I'm pretty
> sure that there are parts of the FH spec that have never been put into
> practice - so using them would not be wise (I'm particularly thinking of
> volatile file handles).
>
> For better or worse, Linux requires directories to have stable
> filehandles for NFSv3. This requirement is effectively imposed by the
> dcache. If there were some way to reliably check if two filehandles
> referred to the same directory, then we could relax that restriction,
> but I don't think there is.
>
> I think the other possible reason mentioned for changing the filehandle
> is to support migration. NFSv3 definitely doesn't support migration.
> NFSv4 explicitly tries to.
>
> NeilBrown
>
> >> > Adding the parent inode to the FH helps find the name of the file by
> >> > looking for the file inode in the parent directory.
> >>
> >> ....and directories have a ".." link, obviating the need to store parent
> >> information in the filehandle.
> >>
> >> NeilBrown
> >>
> >> > Marc.
> >> > > >> > linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 05:15:42 PM: > >> > > >> >> From: Trond Myklebust <trondmy@hammerspace.com> > >> >> To: "mbenjami@redhat.com" <mbenjami@redhat.com> > >> >> Cc: "bcodding@redhat.com" <bcodding@redhat.com>, "malahal@gmail.com" > >> >> <malahal@gmail.com>, "linux-nfs@vger.kernel.org" > >> > <linux-nfs@vger.kernel.org> > >> >> Date: 11/02/2018 05:15 PM > >> >> Subject: Re: "(deleted)" directories > >> >> Sent by: linux-nfs-owner@vger.kernel.org > >> >> > >> >> On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote: > >> >> > It sounds like a pretty good one, that goes to the heart of what a > >> >> > specification is > >> >> > > >> >> > >> >> While admittedly it is (still) Dia de los Muertos today, I would > > think > >> >> that someone who resurrected a part of the NFSv3 spec that has been > >> >> unused for the full 23 years of its existence might have some > >> >> explanation for why they did so? > >> >> > >> >> IOW: not being of a particularly religious persuasion, I usually want > >> >> to understand why features are needed rather than having blind faith > > in > >> >> the person who wrote the spec. > >> >> > >> >> > Matt > >> >> > > >> >> > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust < > >> >> > trondmy@hammerspace.com> wrote: > >> >> > > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni wrote: > >> >> > > > Ben, NFSv3 RFC1813.txt states: "If two file handles from the > > same > >> >> > > > server are equal, they must refer to the same file, but if > >> >> > > > they are > >> >> > > > not equal, no conclusions can be drawn." Ganesha does return > > same > >> >> > > > fileid here (inode). > >> >> > > > > >> >> > > > In NFSv4, they have introduced "unique_handles" attribute. I > >> >> > > > don't > >> >> > > > see > >> >> > > > Linux NFS client using this at all though. 
> >> >> > > > >> >> > > Why does your server need to have multiple filehandles refer to > > the > >> >> > > same file, and why do you expect clients to support this? > >> >> > > > >> >> > > Yes, the spec allows it, but that's not a sufficient reason. > >> >> > > > >> >> > > > Regards, Malahal. > >> >> > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin Coddington < > >> >> > > > bcodding@redhat.com> wrote: > >> >> > > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote: > >> >> > > > > > >> >> > > > > > Hi All, we are using NFS-Ganesha with Linux NFS clients. > > The > >> >> > > > > > client's > >> >> > > > > > shell reports the following. Based on lsof, the directory > > is > >> >> > > > > > marked > >> >> > > > > > deleted. "cd to ROOT and cd to the same home directory > > fixes > >> >> > > > > > the > >> >> > > > > > issue. The client behaves as though the directory is > > deleted > >> >> > > > > > and > >> >> > > > > > recreated! Our NFS-Ganesha server implementation uses > >> >> > > > > > multiple > >> >> > > > > > file > >> >> > > > > > handles that point to the same object. NFS spec says this > >> >> > > > > > should > >> >> > > > > > be > >> >> > > > > > fine, but Linux NFS seems to be broken in this regard. > >> >> > > > > > tcpdump > >> >> > > > > > does > >> >> > > > > > indicate file handle change (note that all file handles are > >> >> > > > > > permanent, > >> >> > > > > > meaning they are valid at the server any time) around this > >> >> > > > > > issue > >> >> > > > > > time. 
> >> >> > > > > > > >> >> > > > > > "shell-init: error retrieving current directory: getcwd: > >> >> > > > > > cannot > >> >> > > > > > access > >> >> > > > > > parent directories: No such file or directory" > >> >> > > > > > sh 112544 malahal cwd DIR > >> >> > > > > > 0,67 > >> >> > > > > > 65536 45605209 /home/malahal (deleted) > >> >> > > > > > (10.120.154.42:/nfs/malahal-export/) > >> >> > > > > > > >> >> > > > > > Function nfs_prime_dcache() seems to invalidate the dcache > >> >> > > > > > entry > >> >> > > > > > if > >> >> > > > > > nfs_same_file() returns false. nfs_same_file() does seem to > >> >> > > > > > return > >> >> > > > > > false with the following change, if I read it correctly, if > >> >> > > > > > there > >> >> > > > > > is a > >> >> > > > > > file handle change. Can this be the source of my issue? It > >> >> > > > > > seems > >> >> > > > > > that > >> >> > > > > > the client should do this only if the file handle is NOT > >> >> > > > > > valid > >> >> > > > > > (e.g. > >> >> > > > > > if it gets ESTALE), right? > >> >> > > > > > > >> >> > > > > > The following commit seems to assume that the objects are > >> >> > > > > > different if > >> >> > > > > > they have different file handles! > >> >> > > > > > commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > >> >> > > > > > Author: Trond Myklebust <trond.myklebust@primarydata.com> > >> >> > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > >> >> > > > > > > >> >> > > > > > NFS: Fix inode corruption in nfs_prime_dcache() > >> >> > > > > > >> >> > > > > My understanding is that for NFSv3 we have to assume that > >> >> > > > > distinct > >> >> > > > > filehandles are distinct objects, but maybe I'm wrong about > >> >> > > > > this. 
> >> >> > > > > > >> >> > > > > For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 > >> >> > > > > section 10.3.4 > >> >> > > > > to determine if the differing filehandles are the same > > object, > >> >> > > > > specifically > >> >> > > > > the fileid recommended attribute needs to be implemented. Is > >> >> > > > > Ganesha > >> >> > > > > returning the same fileid for both filehandles? > >> >> > > > > > >> >> > > > > Ben > >> >> > > -- > >> >> > > Trond Myklebust > >> >> > > CTO, Hammerspace Inc > >> >> > > 4300 El Camino Real, Suite 105 > >> >> > > Los Altos, CA 94022 > >> >> > > www.hammer.space > >> >> > > > >> >> > > > >> >> > > >> >> > > >> >> -- > >> >> Trond Myklebust > >> >> CTO, Hammerspace Inc > >> >> 4300 El Camino Real, Suite 105 > >> >> Los Altos, CA 94022 > >> >> www.hammer.space > >> >> > >> >> > >> [attachment "signature.asc" deleted by Marc Eshel/Almaden/IBM] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: "(deleted)" directories 2018-11-05 4:41 ` Malahal Naineni @ 2018-11-05 5:09 ` NeilBrown 2018-11-05 7:40 ` Malahal Naineni 0 siblings, 1 reply; 24+ messages in thread From: NeilBrown @ 2018-11-05 5:09 UTC (permalink / raw) To: Malahal Naineni Cc: eshel, bcodding, linux-nfs, linux-nfs-owner, mbenjami, trondmy

On Mon, Nov 05 2018, Malahal Naineni wrote:

>> Do we know exactly why the FH changed in this particular circumstance?
>
> In this instance, this is due to a code bug but obviously, there are
> legitimate cases where this occur with Ganesha.

Good to know that the bug has been found, and presumably fixed.
It is not obvious to me that there are any such legitimate cases for directories.

>> (I'm particularly thinking of volatile file handles).
>
> NFS4 RFC has "unique filehandles" concept as well. Linux NFS client
> doesn't seem to use "unique filehandles" attribute as well.

A client doesn't need to use that attribute.
My reading of section 10.3.4 of RFC7530 suggests that the client should generally compare fsid and fileid to see whether two different filehandles refer to the same object.
If unique_handles is known to be set for a given fsid, then different filehandles imply different files, without bothering to check the fileid.
So the use of unique_handles is an optimization.

I haven't looked at the Linux/NFS code to see if it conforms to 10.3.4.

NeilBrown
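
The comparison NeilBrown describes above can be sketched as follows. This is only an illustration of that reading of RFC 7530 section 10.3.4, not the Linux client's code; the `HandleInfo` record, the `same_object` helper and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class HandleInfo:
    """Hypothetical record of what GETATTR returned for one filehandle."""
    filehandle: bytes
    fsid: Tuple[int, int]          # (major, minor); sample values are made up
    fileid: Optional[int] = None   # None if the server did not return fileid


def same_object(a: HandleInfo, b: HandleInfo,
                unique_handles: bool = False) -> Optional[bool]:
    """Return True/False if identity can be decided, None if it cannot."""
    if a.filehandle == b.filehandle:
        return True                # equal handles always name the same object
    if a.fsid != b.fsid:
        return False               # different file systems, different objects
    if unique_handles:
        return False               # the optimization: fileid need not be checked
    if a.fileid is None or b.fileid is None:
        return None                # cannot be determined reliably
    return a.fileid == b.fileid


old = HandleInfo(b"\x01", (0, 67), fileid=1001)
new = HandleInfo(b"\x02", (0, 67), fileid=1001)
print(same_object(old, new))                        # True
print(same_object(old, new, unique_handles=True))   # False
```

The three-valued result matters: when the server returns no fileid, the sketch reports "unknown" rather than guessing, which is the case where client-side caching cannot rely on the comparison.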
* Re: "(deleted)" directories 2018-11-05 5:09 ` NeilBrown @ 2018-11-05 7:40 ` Malahal Naineni 2018-11-05 13:45 ` Trond Myklebust 2018-11-06 8:43 ` Mkrtchyan, Tigran 0 siblings, 2 replies; 24+ messages in thread From: Malahal Naineni @ 2018-11-05 7:40 UTC (permalink / raw) To: neilb; +Cc: eshel, bcodding, linux-nfs, linux-nfs-owner, mbenjami, trondmy

>> My reading of section 10.3.4 of RFC7530 suggests that the client should
>> generally compare fsid and fileid to see if two different filehandles
>> refer to the same object or not.

Section 10.3.4 applies to files only, correct? The issue here concerns directories. Also, Trond clearly pointed out in his email that Linux does not follow section 10.3.4, stating: "We treat always different filehandles as if they refer to different files. It has long been the case that snapshots from several vendors are encoded to look like the same file (same fileid + same fsid) and differing only by filehandle. If we were to try to consolidate those inodes we would end up corrupting application data."

We don't respect either the NFSv3 or the NFSv4 RFC in this regard!

Regards, Malahal.
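
The trade-off in Trond's statement quoted above can be illustrated with a small sketch. It assumes a toy client cache and two made-up objects, a live file and its snapshot, that share fsid and fileid and differ only by filehandle; the `index_by` helper and all values are hypothetical and do not reflect how the kernel stores inodes.

```python
from collections import defaultdict

# Hypothetical objects as a client might see them; all values are made up.
objects = [
    {"name": "report.txt (live)",     "fh": b"\x01", "fsid": (0, 1), "fileid": 1001},
    {"name": "report.txt (snapshot)", "fh": b"\x02", "fsid": (0, 1), "fileid": 1001},
]


def index_by(objs, key):
    """Group object names by the given cache key."""
    index = defaultdict(list)
    for obj in objs:
        index[key(obj)].append(obj["name"])
    return dict(index)


# Keyed by filehandle: the live file and the snapshot stay separate.
by_filehandle = index_by(objects, lambda o: o["fh"])

# Keyed by (fsid, fileid): both collapse into one cache entry.
by_fsid_fileid = index_by(objects, lambda o: (o["fsid"], o["fileid"]))

print(len(by_filehandle))    # 2
print(len(by_fsid_fileid))   # 1
```

Keying by filehandle keeps the snapshot's data apart from the live file's, at the cost that a server handing out a second filehandle for the same directory looks like a new object to the client, which is the symptom reported at the start of this thread.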
* Re: "(deleted)" directories 2018-11-05 7:40 ` Malahal Naineni @ 2018-11-05 13:45 ` Trond Myklebust 2018-11-05 23:49 ` NeilBrown 2018-11-06 8:43 ` Mkrtchyan, Tigran 1 sibling, 1 reply; 24+ messages in thread From: Trond Myklebust @ 2018-11-05 13:45 UTC (permalink / raw) To: malahal, neilb; +Cc: bcodding, mbenjami, eshel, linux-nfs

On Mon, 2018-11-05 at 13:10 +0530, Malahal Naineni wrote:
>>> My reading of section 10.3.4 of RFC7530 suggests that the client
>>> should generally compare fsid and fileid to see if two different
>>> filehandles refer to the same object or not.

Except that is wrong. As I already said in my previous email, there are servers out there in the field that are happy to serve up snapshots that have the exact same fsid and fileid as the original files. NetApp, for instance, will happily do this by default unless you explicitly configure it not to.

> Section 10.3.4 is for files only correct? The issue here is for
> directories. Also, Trond clearly pointed that Linux breaks section
> 10.3.4 from his email stating "We treat always different filehandles
> as if they refer to different files. It has long been the case that
> snapshots from several vendors are encoded to look like the same file
> (same fileid + same fsid) and differing only by filehandle. If we were
> to try to consolidate those inodes we would end up corrupting
> application data."
>
> We don't respect either NFSv3 or NFSv4 RFCs in this regard!

While RFC7530 does have section 10.3.4 that describes "a reliable method to determine whether two distinct filehandles represent distinct objects", as long as server vendors are shipping product that violates it, that entire section is a moot point.

BTW: Note also how the same section reminds server vendors that "For NFSv3 clients, the typical practice has been to assume for the purpose of caching that distinct filehandles represent distinct file system objects."

However, even if a client were to follow Section 10.3.4, Section 9.1.4 states that any open/lock/delegation stateid is associated with a _single filehandle_, and that the lock state it carries is not allowed to be consolidated per file or fileid. See also Section 9.11, which more explicitly describes how to treat the multiple filehandle case.

So while NFSv4 theoretically allows for the behaviour you are asking for, it is not particularly practical to implement, and as I said, the entire Section 10.3.4 is undermined by existing server implementations.
> > > > > > > > > > > > > > linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 > > > > > > > 05:15:42 PM: > > > > > > > > > > > > > > > From: Trond Myklebust <trondmy@hammerspace.com> > > > > > > > > To: "mbenjami@redhat.com" <mbenjami@redhat.com> > > > > > > > > Cc: "bcodding@redhat.com" <bcodding@redhat.com>, " > > > > > > > > malahal@gmail.com" > > > > > > > > <malahal@gmail.com>, "linux-nfs@vger.kernel.org" > > > > > > > <linux-nfs@vger.kernel.org> > > > > > > > > Date: 11/02/2018 05:15 PM > > > > > > > > Subject: Re: "(deleted)" directories > > > > > > > > Sent by: linux-nfs-owner@vger.kernel.org > > > > > > > > > > > > > > > > On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote: > > > > > > > > > It sounds like a pretty good one, that goes to the > > > > > > > > > heart of what a > > > > > > > > > specification is > > > > > > > > > > > > > > > > > > > > > > > > > While admittedly it is (still) Dia de los Muertos > > > > > > > > today, I would > > > > > think > > > > > > > > that someone who resurrected a part of the NFSv3 spec > > > > > > > > that has been > > > > > > > > unused for the full 23 years of its existence might > > > > > > > > have some > > > > > > > > explanation for why they did so? > > > > > > > > > > > > > > > > IOW: not being of a particularly religious persuasion, > > > > > > > > I usually want > > > > > > > > to understand why features are needed rather than > > > > > > > > having blind faith > > > > > in > > > > > > > > the person who wrote the spec. 
> > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust < > > > > > > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni > > > > > > > > > > wrote: > > > > > > > > > > > Ben, NFSv3 RFC1813.txt states: "If two file > > > > > > > > > > > handles from the > > > > > same > > > > > > > > > > > server are equal, they must refer to the same > > > > > > > > > > > file, but if > > > > > > > > > > > they are > > > > > > > > > > > not equal, no conclusions can be drawn." Ganesha > > > > > > > > > > > does return > > > > > same > > > > > > > > > > > fileid here (inode). > > > > > > > > > > > > > > > > > > > > > > In NFSv4, they have introduced "unique_handles" > > > > > > > > > > > attribute. I > > > > > > > > > > > don't > > > > > > > > > > > see > > > > > > > > > > > Linux NFS client using this at all though. > > > > > > > > > > > > > > > > > > > > Why does your server need to have multiple > > > > > > > > > > filehandles refer to > > > > > the > > > > > > > > > > same file, and why do you expect clients to support > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > Yes, the spec allows it, but that's not a > > > > > > > > > > sufficient reason. > > > > > > > > > > > > > > > > > > > > > Regards, Malahal. > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin > > > > > > > > > > > Coddington < > > > > > > > > > > > bcodding@redhat.com> wrote: > > > > > > > > > > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi All, we are using NFS-Ganesha with Linux > > > > > > > > > > > > > NFS clients. > > > > > The > > > > > > > > > > > > > client's > > > > > > > > > > > > > shell reports the following. Based on lsof, > > > > > > > > > > > > > the directory > > > > > is > > > > > > > > > > > > > marked > > > > > > > > > > > > > deleted. 
"cd to ROOT and cd to the same home > > > > > > > > > > > > > directory > > > > > fixes > > > > > > > > > > > > > the > > > > > > > > > > > > > issue. The client behaves as though the > > > > > > > > > > > > > directory is > > > > > deleted > > > > > > > > > > > > > and > > > > > > > > > > > > > recreated! Our NFS-Ganesha server > > > > > > > > > > > > > implementation uses > > > > > > > > > > > > > multiple > > > > > > > > > > > > > file > > > > > > > > > > > > > handles that point to the same object. NFS > > > > > > > > > > > > > spec says this > > > > > > > > > > > > > should > > > > > > > > > > > > > be > > > > > > > > > > > > > fine, but Linux NFS seems to be broken in > > > > > > > > > > > > > this regard. > > > > > > > > > > > > > tcpdump > > > > > > > > > > > > > does > > > > > > > > > > > > > indicate file handle change (note that all > > > > > > > > > > > > > file handles are > > > > > > > > > > > > > permanent, > > > > > > > > > > > > > meaning they are valid at the server any > > > > > > > > > > > > > time) around this > > > > > > > > > > > > > issue > > > > > > > > > > > > > time. > > > > > > > > > > > > > > > > > > > > > > > > > > "shell-init: error retrieving current > > > > > > > > > > > > > directory: getcwd: > > > > > > > > > > > > > cannot > > > > > > > > > > > > > access > > > > > > > > > > > > > parent directories: No such file or > > > > > > > > > > > > > directory" > > > > > > > > > > > > > sh 112544 malahal cwd > > > > > > > > > > > > > DIR > > > > > > > > > > > > > 0,67 > > > > > > > > > > > > > 65536 45605209 /home/malahal (deleted) > > > > > > > > > > > > > (10.120.154.42:/nfs/malahal-export/) > > > > > > > > > > > > > > > > > > > > > > > > > > Function nfs_prime_dcache() seems to > > > > > > > > > > > > > invalidate the dcache > > > > > > > > > > > > > entry > > > > > > > > > > > > > if > > > > > > > > > > > > > nfs_same_file() returns false. 
> > > > > > > > > > > > > nfs_same_file() does seem to > > > > > > > > > > > > > return > > > > > > > > > > > > > false with the following change, if I read it > > > > > > > > > > > > > correctly, if > > > > > > > > > > > > > there > > > > > > > > > > > > > is a > > > > > > > > > > > > > file handle change. Can this be the source of > > > > > > > > > > > > > my issue? It > > > > > > > > > > > > > seems > > > > > > > > > > > > > that > > > > > > > > > > > > > the client should do this only if the file > > > > > > > > > > > > > handle is NOT > > > > > > > > > > > > > valid > > > > > > > > > > > > > (e.g. > > > > > > > > > > > > > if it gets ESTALE), right? > > > > > > > > > > > > > > > > > > > > > > > > > > The following commit seems to assume that the > > > > > > > > > > > > > objects are > > > > > > > > > > > > > different if > > > > > > > > > > > > > they have different file handles! > > > > > > > > > > > > > commit > > > > > > > > > > > > > 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > > > > > > > > > > > > > Author: Trond Myklebust < > > > > > > > > > > > > > trond.myklebust@primarydata.com> > > > > > > > > > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > > > > > > > > > > > > > > > > > > > > > > > > > > NFS: Fix inode corruption in > > > > > > > > > > > > > nfs_prime_dcache() > > > > > > > > > > > > > > > > > > > > > > > > My understanding is that for NFSv3 we have to > > > > > > > > > > > > assume that > > > > > > > > > > > > distinct > > > > > > > > > > > > filehandles are distinct objects, but maybe I'm > > > > > > > > > > > > wrong about > > > > > > > > > > > > this. 
> > > > > > > > > > > > > > > > > > > > > > > > For NFSv4.x, we can follow the guidance in RFCs > > > > > > > > > > > > 5661 or 7530 > > > > > > > > > > > > section 10.3.4 > > > > > > > > > > > > to determine if the differing filehandles are > > > > > > > > > > > > the same > > > > > object, > > > > > > > > > > > > specifically > > > > > > > > > > > > the fileid recommended attribute needs to be > > > > > > > > > > > > implemented. Is > > > > > > > > > > > > Ganesha > > > > > > > > > > > > returning the same fileid for both filehandles? > > > > > > > > > > > > > > > > > > > > > > > > Ben > > > > > > > > > > -- > > > > > > > > > > Trond Myklebust > > > > > > > > > > CTO, Hammerspace Inc > > > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > > > Los Altos, CA 94022 > > > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Trond Myklebust > > > > > > > > CTO, Hammerspace Inc > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > Los Altos, CA 94022 > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > [attachment "signature.asc" deleted by Marc > > > > > > Eshel/Almaden/IBM] Trond Myklebust CTO, Hammerspace Inc 4300 El Camino Real, Suite 105 Los Altos, CA 94022 www.hammer.space -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 24+ messages in thread
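As a rough illustration of the trade-off described in the message above, the sketch below models the two identity policies in Python. It is not the Linux client code: `NfsObject`, `same_by_filehandle`, and `same_by_fsid_fileid` are hypothetical names, and the byte strings standing in for filehandles are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NfsObject:
    """Hypothetical view of what a client learns about one object."""
    filehandle: bytes  # opaque handle returned by the server
    fsid: int          # filesystem identifier attribute
    fileid: int        # per-filesystem file identifier attribute


def same_by_filehandle(a: NfsObject, b: NfsObject) -> bool:
    # Policy the thread attributes to the Linux client: distinct
    # filehandles are treated as distinct objects.
    return a.filehandle == b.filehandle


def same_by_fsid_fileid(a: NfsObject, b: NfsObject) -> bool:
    # Policy suggested by RFC 7530 section 10.3.4: compare fsid and fileid.
    return (a.fsid, a.fileid) == (b.fsid, b.fileid)


# Case 1: a live file and its snapshot on a server that reuses fsid/fileid.
live = NfsObject(b"fh-live", fsid=1, fileid=100)
snapshot = NfsObject(b"fh-snap", fsid=1, fileid=100)

# Case 2: one directory reachable through two handles (the Ganesha report).
dir_old = NfsObject(b"fh-a", fsid=2, fileid=200)
dir_new = NfsObject(b"fh-b", fsid=2, fileid=200)

for label, x, y in [("snapshot", live, snapshot),
                    ("two handles", dir_old, dir_new)]:
    print(label, same_by_filehandle(x, y), same_by_fsid_fileid(x, y))
```

In this model neither policy separates the two cases. Comparing filehandles keeps the snapshot apart from the live file but also splits the single directory in two, while comparing fsid and fileid merges the directory but merges the snapshot with the live file as well.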
* Re: "(deleted)" directories 2018-11-05 13:45 ` Trond Myklebust @ 2018-11-05 23:49 ` NeilBrown 2018-11-06 0:37 ` Trond Myklebust 0 siblings, 1 reply; 24+ messages in thread From: NeilBrown @ 2018-11-05 23:49 UTC (permalink / raw) To: Trond Myklebust, malahal; +Cc: bcodding, mbenjami, eshel, linux-nfs [-- Attachment #1: Type: text/plain, Size: 17217 bytes --] On Mon, Nov 05 2018, Trond Myklebust wrote: > On Mon, 2018-11-05 at 13:10 +0530, Malahal Naineni wrote: >> > > My reading of section 10.3.4 of RFC7530 suggests that the client >> > > should >> generally compare fsid and fileid to see if two different filehandles >> refer to >> the same object or not. > > Except that is wrong. As I already said in my previous email, there are > servers out there in the field that are happy to serve up snapshots > that have the exact same fsid and fileid as the original files. NetApp > will for instance happily do this by default unless you explicitly > configure it not to. > >> >> Section 10.3.4 is for files only correct? The issue here is for >> directories. Also, Trond clearly pointed that Linux breaks section >> 10.3.4 from his email stating "We treat always different filehandles >> as if they refer to different >> files. It has long been the case that snapshots from several vendors >> are encoded to look like the same file (same fileid + same fsid) and >> differing only by filehandle. If we were to try to consolidate those >> inodes we would end up corrupting application data." >> >> We don't respect either NFSv3 or NFSv4 RFCs in this regard! > > While RFC7530 does have section 10.3.4 that describes "a reliable > method to determine whether two distinct filehandles represent distinct > objects", as long as server vendors are shipping product that violates > it, then that entire section is a moot point. Ignoring the spec in order to support broken servers wouldn't be my first choice, but you do have a point. 
However, this is only an issue (as far as I know) in a specific circumstance that would not (I think) affect those servers. If we do a lookup of a name that we already have in the dcache, and we get a filehandle which is different from the cached inode, but has the same fsid/fileid as the cached inode, then it isn't going to be the same file in a different snapshot. In that case it might be reasonable to treat it as the same file, at least when it is a directory. i.e. same (fsid, fileid, type, name) means same object. Maybe that would be too messy to implement, but it seems to be a possible balance between compliance and safety. It should stop directories from becoming "(deleted)" but shouldn't risk data corruption. Thanks, NeilBrown
> BTW: Note also how the same section reminds server vendors that "For NFSv3 clients, the typical practice has been to assume for the purpose of caching that distinct filehandles represent distinct file system objects."
>
> However, even if a client were to follow Section 10.3.4, then Section 9.1.4 states that any open/lock/delegation stateid is associated with a _single filehandle_, and that the lock state it carries is not allowed to be consolidated per file or fileid. See also Section 9.11, which more explicitly describes how to treat the multiple filehandle case.
>
> So while NFSv4 theoretically allows for the behaviour you are asking for, it is not particularly practical to implement, and as I said, the entire Section 10.3.4 is undermined by existing server implementations.
[-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 24+ messages in thread
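The heuristic proposed in the message above could look roughly like the following Python sketch. It is only a model of the idea, not a patch: `LookupResult` and `keep_cached_dentry` are hypothetical names, and the sketch takes the "(fsid, fileid, type, name)" rule at face value while restricting it to directories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LookupResult:
    """Hypothetical summary of a LOOKUP reply for one name."""
    name: str
    filehandle: bytes
    fsid: int
    fileid: int
    is_dir: bool


def keep_cached_dentry(cached: LookupResult, fresh: LookupResult) -> bool:
    """Return True if the cached entry should survive a fresh lookup."""
    if cached.filehandle == fresh.filehandle:
        return True  # identical handle: same object
    # Different handle: accept only a directory found under the same name
    # with unchanged fsid and fileid.
    return (
        cached.is_dir
        and fresh.is_dir
        and cached.name == fresh.name
        and (cached.fsid, cached.fileid) == (fresh.fsid, fresh.fileid)
    )


# A home directory whose handle changed between two lookups.
home_old = LookupResult("malahal", b"fh-a", fsid=2, fileid=200, is_dir=True)
home_new = LookupResult("malahal", b"fh-b", fsid=2, fileid=200, is_dir=True)

# A regular file in the same situation is still treated as a new object.
file_old = LookupResult("data", b"fh-c", fsid=2, fileid=300, is_dir=False)
file_new = LookupResult("data", b"fh-d", fsid=2, fileid=300, is_dir=False)

print(keep_cached_dentry(home_old, home_new))
print(keep_cached_dentry(file_old, file_new))
```

Limiting the relaxed comparison to directories reflects the "at least when it is a directory" qualifier in the proposal. Whether the same-name condition really excludes snapshots is disputed in the reply that follows.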
* Re: "(deleted)" directories 2018-11-05 23:49 ` NeilBrown @ 2018-11-06 0:37 ` Trond Myklebust 0 siblings, 0 replies; 24+ messages in thread From: Trond Myklebust @ 2018-11-06 0:37 UTC (permalink / raw) To: neilb, malahal; +Cc: bcodding, mbenjami, eshel, linux-nfs On Tue, 2018-11-06 at 10:49 +1100, NeilBrown wrote: > On Mon, Nov 05 2018, Trond Myklebust wrote: > > > On Mon, 2018-11-05 at 13:10 +0530, Malahal Naineni wrote: > > > > > My reading of section 10.3.4 of RFC7530 suggests that the > > > > > client > > > > > should > > > generally compare fsid and fileid to see if two different > > > filehandles > > > refer to > > > the same object or not. > > > > Except that is wrong. As I already said in my previous email, there > > are > > servers out there in the field that are happy to serve up snapshots > > that have the exact same fsid and fileid as the original files. > > NetApp > > will for instance happily do this by default unless you explicitly > > configure it not to. > > > > > Section 10.3.4 is for files only correct? The issue here is for > > > directories. Also, Trond clearly pointed that Linux breaks > > > section > > > 10.3.4 from his email stating "We treat always different > > > filehandles > > > as if they refer to different > > > files. It has long been the case that snapshots from several > > > vendors > > > are encoded to look like the same file (same fileid + same fsid) > > > and > > > differing only by filehandle. If we were to try to consolidate > > > those > > > inodes we would end up corrupting application data." > > > > > > We don't respect either NFSv3 or NFSv4 RFCs in this regard! > > > > While RFC7530 does have section 10.3.4 that describes "a reliable > > method to determine whether two distinct filehandles represent > > distinct > > objects", as long as server vendors are shipping product that > > violates > > it, then that entire section is a moot point. 
> > Ignoring the spec in order to support broken servers wouldn't be my > first choice, but you do have a point. > However, this is only an issue (as far as I know) in a specific > circumstance that would not (I think) affect those servers. It's about not breaking existing setups that have been working like this for > 20 years. > If we do a lookup of a name that we already have in the dcache, and > we > get a filehandle which is different from the cached inode, but has > the > same fsid/fileid as the cached inode, then it isn't going to be the > same > file in a different snapshot. In that case it might be reasonable to > treat it as the same file, at least when it is a directory. > i.e. same ( fsid, fileid, type, name) means same object. That's not the case for NetApp afaik. It can mean "snapshot of same object". The reason why this is allowed, and often preferred by sysadmins is that doing an "under the cover mount" in order to satisfy a fsid change is generally not scalable if you have several .snapshot entries per directory. > Maybe that would be too messy to implement, but it seems to be a > possible balance between compliance and safety. It should stop > directories from becoming "(deleted)" but shouldn't risk data > corruption. > > > Thanks, > NeilBrown > > > > BTW: Note also how the same section reminds server vendors that > > "For > > NFSv3 clients, the typical practice has been to assume for the > > purpose > > of caching that distinct filehandles represent distinct file system > > objects." > > > > However, even if a client were to follow Section 10.3.4, then > > Section > > 9.1.4 states that any open/lock/delegation stateid is associated > > with a > > _single filehandle_, and that the lock state it carries is not > > allowed > > to be consolidated per file or fileid. See also Section 9.11, which > > more explicitly describes how to treat the multiple filehandle > > case. 
> > > > So while NFSv4 theoretically allows for the behaviour you are > > asking > > for, it is not particularly practical to implement, and as I said, > > the > > entire Section 10.3.4 is undermined by existing server > > implementations. > > > > > Regards, Malahal. > > > > > > > > > Regards, Malahal. > > > On Mon, Nov 5, 2018 at 10:39 AM NeilBrown <neilb@suse.com> wrote: > > > > On Mon, Nov 05 2018, Malahal Naineni wrote: > > > > > > > > > > Do we know exactly why the FH changed in this particular > > > > > > circumstance? > > > > > > > > > > In this instance, this is due to a code bug but obviously, > > > > > there > > > > > are > > > > > legitimate cases where this occur with Ganesha. > > > > > > > > Good to know that bug has been found, and presumably fixed. > > > > It is not obvious to me that there are any such legitimate > > > > cases > > > > for > > > > directories. > > > > > > > > > > (I'm particularly thinking of volatile file handles). > > > > > > > > > > NFS4 RFC has "unique filehandles" concept as well. Linux NFS > > > > > client > > > > > doesn't seem to use "unique filehandles" attribute as well. > > > > > > > > A client doesn't need to use that attribute. > > > > My reading of section 10.3.4 of RFC7530 suggests that the > > > > client > > > > should > > > > generally compare fsid and fileid to see if two different > > > > filehandles refer to > > > > the same object or not. > > > > If unique_handles is known to be set for a given fsid, then > > > > different > > > > filehandles imply different files, without bothering to check > > > > the > > > > fileid. > > > > So the use of unique_handles is an optimization. > > > > > > > > I haven't looked at the Linux/NFS code to see if it conforms to > > > > 10.3.4. 
> > > > > > > > NeilBrown > > > > > > > > > > > > > On Mon, Nov 5, 2018 at 6:02 AM NeilBrown <neilb@suse.com> > > > > > wrote: > > > > > > On Sun, Nov 04 2018, Marc Eshel wrote: > > > > > > > > > > > > > linux-nfs-owner@vger.kernel.org wrote on 11/03/2018 > > > > > > > 10:31:29 > > > > > > > PM: > > > > > > > > > > > > > > > From: NeilBrown <neilb@suse.com> > > > > > > > > To: Marc Eshel <eshel@us.ibm.com>, Trond Myklebust > > > > > > > <trondmy@hammerspace.com> > > > > > > > > Cc: "bcodding\@redhat.com" <bcodding@redhat.com>, > > > > > > > > "linux- > > > > > > > > nfs > > > > > > > > \@vger.kernel.org" <linux-nfs@vger.kernel.org>, linux- > > > > > > > > nfs- > > > > > > > > owner@vger.kernel.org, "malahal\@gmail.com" < > > > > > > > > malahal@gmail.com>, > > > > > > > > "mbenjami\@redhat.com" <mbenjami@redhat.com> > > > > > > > > Date: 11/03/2018 10:41 PM > > > > > > > > Subject: Re: "(deleted)" directories > > > > > > > > Sent by: linux-nfs-owner@vger.kernel.org > > > > > > > > > > > > > > > > On Fri, Nov 02 2018, Marc Eshel wrote: > > > > > > > > > > > > > > > > > One reason to have different FHs for the same file is > > > > > > > > > that a file can > > > > > > > be > > > > > > > > > linked from multiple directories. > > > > > > > > > > > > > > > > This has some based when considering filehandles for > > > > > > > > non- > > > > > > > > directories. > > > > > > > > However the original problem was with filehandles for > > > > > > > > directories..... > > > > > > > > > > > > > > This was just an example of why FH might be different, I > > > > > > > don't think we > > > > > > > depend on it for the parent information anymore. Malahal > > > > > > > listed some other > > > > > > > reasons for having different FH for the same file. I > > > > > > > believe > > > > > > > that Ganesha > > > > > > > split the FH to the key portion (the unique id of the > > > > > > > file) > > > > > > > and some other > > > > > > > information that is file system dependent. 
If the NFS > > > > > > > client > > > > > > > can not > > > > > > > handle the spec definition of FH maybe the spec should be > > > > > > > updated to > > > > > > > something like Ganesha does. > > > > > > > Marc. > > > > > > > > > > > > Do we know exactly why the FH changed in this particular > > > > > > circumstance? > > > > > > Is there some way to find out? > > > > > > > > > > > > The NFSv3 spec has been updated - it is called "NFSv4" (now > > > > > > 4.2). It > > > > > > says a lot more things about filehandles, but even there, > > > > > > the > > > > > > spec is > > > > > > only as good as the what has been implemented and > > > > > > tested. I'm > > > > > > pretty > > > > > > sure that there are parts of the FH spec that have never > > > > > > been > > > > > > put into > > > > > > practice - so using them would not be wise (I'm > > > > > > particularly > > > > > > thinking of > > > > > > volatile file handles). > > > > > > > > > > > > For better or worse, Linux requires directories to have > > > > > > stable > > > > > > filehandles for NFSv3. This requirement is effectively > > > > > > imposed > > > > > > by the > > > > > > dcache. If there were some way to reliably check if two > > > > > > filehandles > > > > > > referred to the same directory, then we could relax that > > > > > > restriction, > > > > > > but I don't think there is. > > > > > > > > > > > > I think the other possible reason mentioned for changing > > > > > > the > > > > > > filehandle > > > > > > is to support migration. NFSv3 definitely doesn't support > > > > > > migration. > > > > > > NFSv4 explicitly tries to. > > > > > > > > > > > > NeilBrown > > > > > > > > > > > > > > > > > > > > > Adding the parent inode to the FH help finding the > > > > > > > > > the > > > > > > > > > name of the > > > > > > > file by > > > > > > > > > looking for the file inode in > > > > > > > > > the parent directoy. > > > > > > > > > > > > > > > > > > > > > > > > > ....and directories have a ".." 
link, obviating the > > > > > > > > need to > > > > > > > > store parent > > > > > > > > information in the filehandle. > > > > > > > > > > > > > > > > NeilBrown > > > > > > > > > > > > > > > > > > > > > > > > > Marc. > > > > > > > > > > > > > > > > > > linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 > > > > > > > > > 05:15:42 PM: > > > > > > > > > > > > > > > > > > > From: Trond Myklebust <trondmy@hammerspace.com> > > > > > > > > > > To: "mbenjami@redhat.com" <mbenjami@redhat.com> > > > > > > > > > > Cc: "bcodding@redhat.com" <bcodding@redhat.com>, " > > > > > > > > > > malahal@gmail.com" > > > > > > > > > > <malahal@gmail.com>, "linux-nfs@vger.kernel.org" > > > > > > > > > <linux-nfs@vger.kernel.org> > > > > > > > > > > Date: 11/02/2018 05:15 PM > > > > > > > > > > Subject: Re: "(deleted)" directories > > > > > > > > > > Sent by: linux-nfs-owner@vger.kernel.org > > > > > > > > > > > > > > > > > > > > On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin > > > > > > > > > > wrote: > > > > > > > > > > > It sounds like a pretty good one, that goes to > > > > > > > > > > > the > > > > > > > > > > > heart of what a > > > > > > > > > > > specification is > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > While admittedly it is (still) Dia de los Muertos > > > > > > > > > > today, I would > > > > > > > think > > > > > > > > > > that someone who resurrected a part of the NFSv3 > > > > > > > > > > spec > > > > > > > > > > that has been > > > > > > > > > > unused for the full 23 years of its existence might > > > > > > > > > > have some > > > > > > > > > > explanation for why they did so? > > > > > > > > > > > > > > > > > > > > IOW: not being of a particularly religious > > > > > > > > > > persuasion, > > > > > > > > > > I usually want > > > > > > > > > > to understand why features are needed rather than > > > > > > > > > > having blind faith > > > > > > > in > > > > > > > > > > the person who wrote the spec. 
> > > > > > > > > > > > > > > > > > > > > Matt > > > > > > > > > > > > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust < > > > > > > > > > > > trondmy@hammerspace.com> wrote: > > > > > > > > > > > > On Fri, 2018-11-02 at 21:24 +0530, Malahal > > > > > > > > > > > > Naineni > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Ben, NFSv3 RFC1813.txt states: "If two file > > > > > > > > > > > > > handles from the > > > > > > > same > > > > > > > > > > > > > server are equal, they must refer to the same > > > > > > > > > > > > > file, but if > > > > > > > > > > > > > they are > > > > > > > > > > > > > not equal, no conclusions can be drawn." > > > > > > > > > > > > > Ganesha > > > > > > > > > > > > > does return > > > > > > > same > > > > > > > > > > > > > fileid here (inode). > > > > > > > > > > > > > > > > > > > > > > > > > > In NFSv4, they have introduced > > > > > > > > > > > > > "unique_handles" > > > > > > > > > > > > > attribute. I > > > > > > > > > > > > > don't > > > > > > > > > > > > > see > > > > > > > > > > > > > Linux NFS client using this at all though. > > > > > > > > > > > > > > > > > > > > > > > > Why does your server need to have multiple > > > > > > > > > > > > filehandles refer to > > > > > > > the > > > > > > > > > > > > same file, and why do you expect clients to > > > > > > > > > > > > support > > > > > > > > > > > > this? > > > > > > > > > > > > > > > > > > > > > > > > Yes, the spec allows it, but that's not a > > > > > > > > > > > > sufficient reason. > > > > > > > > > > > > > > > > > > > > > > > > > Regards, Malahal. 
> > > > > > > > > > > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin > > > > > > > > > > > > > Coddington < > > > > > > > > > > > > > bcodding@redhat.com> wrote: > > > > > > > > > > > > > > On 2 Nov 2018, at 1:26, Malahal Naineni > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hi All, we are using NFS-Ganesha with > > > > > > > > > > > > > > > Linux > > > > > > > > > > > > > > > NFS clients. > > > > > > > The > > > > > > > > > > > > > > > client's > > > > > > > > > > > > > > > shell reports the following. Based on > > > > > > > > > > > > > > > lsof, > > > > > > > > > > > > > > > the directory > > > > > > > is > > > > > > > > > > > > > > > marked > > > > > > > > > > > > > > > deleted. "cd to ROOT and cd to the same > > > > > > > > > > > > > > > home > > > > > > > > > > > > > > > directory > > > > > > > fixes > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > issue. The client behaves as though the > > > > > > > > > > > > > > > directory is > > > > > > > deleted > > > > > > > > > > > > > > > and > > > > > > > > > > > > > > > recreated! Our NFS-Ganesha server > > > > > > > > > > > > > > > implementation uses > > > > > > > > > > > > > > > multiple > > > > > > > > > > > > > > > file > > > > > > > > > > > > > > > handles that point to the same object. > > > > > > > > > > > > > > > NFS > > > > > > > > > > > > > > > spec says this > > > > > > > > > > > > > > > should > > > > > > > > > > > > > > > be > > > > > > > > > > > > > > > fine, but Linux NFS seems to be broken in > > > > > > > > > > > > > > > this regard. 
> > > > > > > > > > > > > > > tcpdump > > > > > > > > > > > > > > > does > > > > > > > > > > > > > > > indicate file handle change (note that > > > > > > > > > > > > > > > all > > > > > > > > > > > > > > > file handles are > > > > > > > > > > > > > > > permanent, > > > > > > > > > > > > > > > meaning they are valid at the server any > > > > > > > > > > > > > > > time) around this > > > > > > > > > > > > > > > issue > > > > > > > > > > > > > > > time. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > "shell-init: error retrieving current > > > > > > > > > > > > > > > directory: getcwd: > > > > > > > > > > > > > > > cannot > > > > > > > > > > > > > > > access > > > > > > > > > > > > > > > parent directories: No such file or > > > > > > > > > > > > > > > directory" > > > > > > > > > > > > > > > sh 112544 malahal cwd > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > DIR > > > > > > > > > > > > > > > 0,67 > > > > > > > > > > > > > > > 65536 45605209 /home/malahal > > > > > > > > > > > > > > > (deleted) > > > > > > > > > > > > > > > (10.120.154.42:/nfs/malahal-export/) > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Function nfs_prime_dcache() seems to > > > > > > > > > > > > > > > invalidate the dcache > > > > > > > > > > > > > > > entry > > > > > > > > > > > > > > > if > > > > > > > > > > > > > > > nfs_same_file() returns false. > > > > > > > > > > > > > > > nfs_same_file() does seem to > > > > > > > > > > > > > > > return > > > > > > > > > > > > > > > false with the following change, if I > > > > > > > > > > > > > > > read it > > > > > > > > > > > > > > > correctly, if > > > > > > > > > > > > > > > there > > > > > > > > > > > > > > > is a > > > > > > > > > > > > > > > file handle change. Can this be the > > > > > > > > > > > > > > > source of > > > > > > > > > > > > > > > my issue? 
It > > > > > > > > > > > > > > > seems > > > > > > > > > > > > > > > that > > > > > > > > > > > > > > > the client should do this only if the > > > > > > > > > > > > > > > file > > > > > > > > > > > > > > > handle is NOT > > > > > > > > > > > > > > > valid > > > > > > > > > > > > > > > (e.g. > > > > > > > > > > > > > > > if it gets ESTALE), right? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The following commit seems to assume that > > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > objects are > > > > > > > > > > > > > > > different if > > > > > > > > > > > > > > > they have different file handles! > > > > > > > > > > > > > > > commit > > > > > > > > > > > > > > > 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > > > > > > > > > > > > > > > Author: Trond Myklebust < > > > > > > > > > > > > > > > trond.myklebust@primarydata.com> > > > > > > > > > > > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > NFS: Fix inode corruption in > > > > > > > > > > > > > > > nfs_prime_dcache() > > > > > > > > > > > > > > > > > > > > > > > > > > > > My understanding is that for NFSv3 we have > > > > > > > > > > > > > > to > > > > > > > > > > > > > > assume that > > > > > > > > > > > > > > distinct > > > > > > > > > > > > > > filehandles are distinct objects, but maybe > > > > > > > > > > > > > > I'm > > > > > > > > > > > > > > wrong about > > > > > > > > > > > > > > this. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > For NFSv4.x, we can follow the guidance in > > > > > > > > > > > > > > RFCs > > > > > > > > > > > > > > 5661 or 7530 > > > > > > > > > > > > > > section 10.3.4 > > > > > > > > > > > > > > to determine if the differing filehandles > > > > > > > > > > > > > > are > > > > > > > > > > > > > > the same > > > > > > > object, > > > > > > > > > > > > > > specifically > > > > > > > > > > > > > > the fileid recommended attribute needs to > > > > > > > > > > > > > > be > > > > > > > > > > > > > > implemented. Is > > > > > > > > > > > > > > Ganesha > > > > > > > > > > > > > > returning the same fileid for both > > > > > > > > > > > > > > filehandles? > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ben > > > > > > > > > > > > -- > > > > > > > > > > > > Trond Myklebust > > > > > > > > > > > > CTO, Hammerspace Inc > > > > > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > > > > > Los Altos, CA 94022 > > > > > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Trond Myklebust > > > > > > > > > > CTO, Hammerspace Inc > > > > > > > > > > 4300 El Camino Real, Suite 105 > > > > > > > > > > Los Altos, CA 94022 > > > > > > > > > > www.hammer.space > > > > > > > > > > > > > > > > > > > > > > > > > > > > [attachment "signature.asc" deleted by Marc > > > > > > > > Eshel/Almaden/IBM] > > Trond Myklebust > > CTO, Hammerspace Inc > > 4300 El Camino Real, Suite 105 > > Los Altos, CA 94022 > > www.hammer.space > > > > -- > > Trond Myklebust > > Linux NFS client maintainer, Hammerspace > > trond.myklebust@hammerspace.com -- Trond Myklebust CTO, Hammerspace Inc 4300 El Camino Real, Suite 105 Los Altos, CA 94022 www.hammer.space ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: "(deleted)" directories 2018-11-05 7:40 ` Malahal Naineni 2018-11-05 13:45 ` Trond Myklebust @ 2018-11-06 8:43 ` Mkrtchyan, Tigran 2018-11-06 19:25 ` Bill Baker 1 sibling, 1 reply; 24+ messages in thread From: Mkrtchyan, Tigran @ 2018-11-06 8:43 UTC (permalink / raw) To: Malahal Naineni Cc: NeilBrown, Marc Eshel, Benjamin Coddington, linux-nfs, linux-nfs-owner, Matt Benjamin, trondmy, bill.baker ----- Original Message ----- > From: "Malahal Naineni" <malahal@gmail.com> > To: "NeilBrown" <neilb@suse.com> > Cc: "Marc Eshel" <eshel@us.ibm.com>, "Benjamin Coddington" <bcodding@redhat.com>, "linux-nfs" > <linux-nfs@vger.kernel.org>, linux-nfs-owner@vger.kernel.org, "Matt Benjamin" <mbenjami@redhat.com>, "trondmy" > <trondmy@hammerspace.com> > Sent: Monday, November 5, 2018 8:40:16 AM > Subject: Re: "(deleted)" directories >>> My reading of section 10.3.4 of RFC7530 suggests that the client should > generally compare fsid and fileid to see if two different filehandles refer to > the same object or not. > > Section 10.3.4 is for files only correct? The issue here is for > directories. Also, Trond clearly pointed that Linux breaks section > 10.3.4 from his email stating "We treat always different filehandles > as if they refer to different > files. It has long been the case that snapshots from several vendors > are encoded to look like the same file (same fileid + same fsid) and > differing only by filehandle. If we were to try to consolidate those > inodes we would end up corrupting application data." > > We don't respect either NFSv3 or NFSv4 RFCs in this regard! In general, you are right. The spec is our source of truth. However, I wouldn't rely on any word in the rfc if it wasn't implemented and tested by various client and server implementations (this is IETF requirement!). That's why we run multiple bake-a-thons and connectathons. 
Moreover, from my experience of having an outsider server implementation, I can say that even if a desired change ends up in the upstream kernel, there are clients around that don't support that functionality and will force you to have a solution on the server side. BTW, at one of the connectathons, Bill Baker from Oracle was talking about the migration implementation on the Solaris server. One of the issues that they suffered from was that file handles were in host-native format, and they had problems migrating from SPARC to x64. I am surprised that you ran into the same issue. Regards, Tigran. > > Regards, Malahal. > > > Regards, Malahal. > On Mon, Nov 5, 2018 at 10:39 AM NeilBrown <neilb@suse.com> wrote: >> >> On Mon, Nov 05 2018, Malahal Naineni wrote: >> >> >> Do we know exactly why the FH changed in this particular circumstance? >> > >> > In this instance, this is due to a code bug but obviously, there are >> > legitimate cases where this occur with Ganesha. >> >> Good to know that bug has been found, and presumably fixed. >> It is not obvious to me that there are any such legitimate cases for >> directories. >> >> > >> >> (I'm particularly thinking of volatile file handles). >> > >> > NFS4 RFC has "unique filehandles" concept as well. Linux NFS client >> > doesn't seem to use "unique filehandles" attribute as well. >> >> A client doesn't need to use that attribute. >> My reading of section 10.3.4 of RFC7530 suggests that the client should >> generally compare fsid and fileid to see if two different filehandles refer to >> the same object or not. >> If unique_handles is known to be set for a given fsid, then different >> filehandles imply different files, without bothering to check the >> fileid. >> So the use of unique_handles is an optimization. >> >> I haven't looked at the Linux/NFS code to see if it conforms to 10.3.4.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: "(deleted)" directories 2018-11-06 8:43 ` Mkrtchyan, Tigran @ 2018-11-06 19:25 ` Bill Baker 2018-11-06 22:17 ` Rick Macklem 0 siblings, 1 reply; 24+ messages in thread From: Bill Baker @ 2018-11-06 19:25 UTC (permalink / raw) To: Mkrtchyan, Tigran, Malahal Naineni Cc: NeilBrown, Marc Eshel, Benjamin Coddington, linux-nfs, linux-nfs-owner, Matt Benjamin, trondmy On 11/6/18 2:43 AM, Mkrtchyan, Tigran wrote: > > > ----- Original Message ----- >> From: "Malahal Naineni" <malahal@gmail.com> >> To: "NeilBrown" <neilb@suse.com> >> Cc: "Marc Eshel" <eshel@us.ibm.com>, "Benjamin Coddington" <bcodding@redhat.com>, "linux-nfs" >> <linux-nfs@vger.kernel.org>, linux-nfs-owner@vger.kernel.org, "Matt Benjamin" <mbenjami@redhat.com>, "trondmy" >> <trondmy@hammerspace.com> >> Sent: Monday, November 5, 2018 8:40:16 AM >> Subject: Re: "(deleted)" directories > >>>> My reading of section 10.3.4 of RFC7530 suggests that the client should >> generally compare fsid and fileid to see if two different filehandles refer to >> the same object or not. >> >> Section 10.3.4 is for files only correct? The issue here is for >> directories. Also, Trond clearly pointed that Linux breaks section >> 10.3.4 from his email stating "We treat always different filehandles >> as if they refer to different >> files. It has long been the case that snapshots from several vendors >> are encoded to look like the same file (same fileid + same fsid) and >> differing only by filehandle. If we were to try to consolidate those >> inodes we would end up corrupting application data." >> >> We don't respect either NFSv3 or NFSv4 RFCs in this regard! > > > In general, you are right. The spec is our source of truth. However, > I wouldn't rely on any word in the rfc if it wasn't implemented and > tested by various client and server implementations (this is IETF requirement!). > That's why we run multiple bake-a-thons and connectathons. 
Moreover, > from my experience of having an outsider server implementation, I can say > that even if a desired change ends up in the upstream kernel, there are clients > around that don't support that functionality and will force you to > have a solution on the server side. > > BTW, at one of the connectathons, Bill Baker from Oracle was talking > about the migration implementation on the Solaris server. One of the issues > that they suffered from was that file handles were in host-native format, and > they had problems migrating from SPARC to x64. I am surprised that > you ran into the same issue. > Yeah, though that is just an error in implementation. ZFS stores its persistent data in network byte order so that a filesystem can be imported on machines with different byte orders. +1 for the ZFS guys, but sadly the NFS server failed to XDR-encode the file handle, as needed to enable FH portability between architectures. -1 for the (Solaris) NFS guys. Fixed for 4.1 file handles, though. <snip> -- Bill Baker, Oracle Linux NFS development ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: "(deleted)" directories 2018-11-06 19:25 ` Bill Baker @ 2018-11-06 22:17 ` Rick Macklem 0 siblings, 0 replies; 24+ messages in thread From: Rick Macklem @ 2018-11-06 22:17 UTC (permalink / raw) To: Bill Baker, Mkrtchyan, Tigran, Malahal Naineni Cc: NeilBrown, Marc Eshel, Benjamin Coddington, linux-nfs, linux-nfs-owner, Matt Benjamin, trondmy I'll admit I lost the indentations indicating who wrote what, but my comment is fairly generic w.r.t. this. >> From: "Malahal Naineni" <malahal@gmail.com> >> To: "NeilBrown" <neilb@suse.com> >> Cc: "Marc Eshel" <eshel@us.ibm.com>, "Benjamin Coddington" <bcodding@redhat.com>, "linux-nfs" >> <linux-nfs@vger.kernel.org>, linux-nfs-owner@vger.kernel.org, "Matt Benjamin" <mbenjami@redhat.com>, "trondmy" >> <trondmy@hammerspace.com> >> Sent: Monday, November 5, 2018 8:40:16 AM >> Subject: Re: "(deleted)" directories > >>>> My reading of section 10.3.4 of RFC7530 suggests that the client should >> generally compare fsid and fileid to see if two different filehandles refer to >> the same object or not. >> >> Section 10.3.4 is for files only correct? The issue here is for >> directories. Also, Trond clearly pointed that Linux breaks section >> 10.3.4 from his email stating "We treat always different filehandles >> as if they refer to different >> files. It has long been the case that snapshots from several vendors >> are encoded to look like the same file (same fileid + same fsid) and >> differing only by filehandle. If we were to try to consolidate those >> inodes we would end up corrupting application data." >> >> We don't respect either NFSv3 or NFSv4 RFCs in this regard! Part of the question for NFSv4 is whether or not the server returns "unique_handles == true"? For example, if Netapp filers return "unique_handles" as true, then they are fine for NFSv4 using the same (fsid, fileno) for two distinct files. (Second bullet point in RFC 7530 Sec. 10.3.4.) 
Is there an example (like Netapp) where the only difference is the file
handle and the server returns "unique_handles == false"?

One approach for a client might be the following:
- if NFSv3
   - set unique_handles true
- if file handles same
   - same file object
  else if unique_handles == true
   - different file object
  else
   - if fsid or fileno not same
      - different file object
     else
      - same file object

Btw, Sec. 10.3.4 of RFC7530 discusses file objects, so I'd think it is
meant to apply to directories as well as regular files. (It does talk
about "file data", but I think that can refer to directory contents?)

I'll admit I can't remember what the behaviour of the FreeBSD NFS client
is for non-unique file handles, but I might take a look and try to make it
use the above algorithm.

rick

^ permalink raw reply [flat|nested] 24+ messages in thread
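Rick's proposed comparison above can be sketched in a few lines of Python. This is only an illustration of the decision order he describes, not code from any NFS client; the NfsObject type and the same_file_object() function are hypothetical names introduced here, and the fsid is assumed to be a simple tuple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NfsObject:
    """Hypothetical client-side record of what the server returned."""
    filehandle: bytes
    fsid: tuple      # assumed (major, minor) pair
    fileid: int      # "fileno" in Rick's wording


def same_file_object(a, b, nfs_version, unique_handles=False):
    """Follow the decision order from the message above."""
    # NFSv3 has no unique_handles attribute, so the proposal treats it as true.
    if nfs_version == 3:
        unique_handles = True
    # Equal file handles always mean the same object.
    if a.filehandle == b.filehandle:
        return True
    # Server promises one handle per object: different handle, different object.
    if unique_handles:
        return False
    # Otherwise fall back to comparing (fsid, fileid).
    if a.fsid != b.fsid or a.fileid != b.fileid:
        return False
    return True
```

With this ordering, two different handles carrying the same (fsid, fileid) are treated as one object only for NFSv4 servers that report unique_handles as false, which is the one case where the handle alone decides nothing.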
* Re: "(deleted)" directories 2018-11-05 0:32 ` NeilBrown 2018-11-05 4:41 ` Malahal Naineni @ 2018-11-05 4:48 ` Marc Eshel 1 sibling, 0 replies; 24+ messages in thread From: Marc Eshel @ 2018-11-05 4:48 UTC (permalink / raw) To: NeilBrown Cc: bcodding, linux-nfs, linux-nfs-owner, malahal, mbenjami, Trond Myklebust

This thread got split. Malahal is trying to resolve a specific problem and
he knows much more about it, so I will butt out and let him continue to
drive his thread.
Marc.

From: NeilBrown <neilb@suse.com>
To: Marc Eshel <eshel@us.ibm.com>
Cc: "bcodding\@redhat.com" <bcodding@redhat.com>,
"linux-nfs\@vger.kernel.org" <linux-nfs@vger.kernel.org>,
linux-nfs-owner@vger.kernel.org, "malahal\@gmail.com" <malahal@gmail.com>,
"mbenjami\@redhat.com" <mbenjami@redhat.com>,
"Trond Myklebust" <trondmy@hammerspace.com>
Date: 11/04/2018 04:33 PM
Subject: Re: "(deleted)" directories
Sent by: linux-nfs-owner@vger.kernel.org

On Sun, Nov 04 2018, Marc Eshel wrote:

> linux-nfs-owner@vger.kernel.org wrote on 11/03/2018 10:31:29 PM:
>
>> From: NeilBrown <neilb@suse.com>
>> To: Marc Eshel <eshel@us.ibm.com>, Trond Myklebust <trondmy@hammerspace.com>
>> Cc: "bcodding\@redhat.com" <bcodding@redhat.com>,
>> "linux-nfs\@vger.kernel.org" <linux-nfs@vger.kernel.org>,
>> linux-nfs-owner@vger.kernel.org, "malahal\@gmail.com" <malahal@gmail.com>,
>> "mbenjami\@redhat.com" <mbenjami@redhat.com>
>> Date: 11/03/2018 10:41 PM
>> Subject: Re: "(deleted)" directories
>> Sent by: linux-nfs-owner@vger.kernel.org
>>
>> On Fri, Nov 02 2018, Marc Eshel wrote:
>>
>> > One reason to have different FHs for the same file is that a file can be
>> > linked from multiple directories.
>>
>> This has some basis when considering filehandles for non-directories.
>> However the original problem was with filehandles for directories.....
>
> This was just an example of why FH might be different, I don't think we
> depend on it for the parent information anymore. Malahal listed some other
> reasons for having different FH for the same file. I believe that Ganesha
> splits the FH into the key portion (the unique id of the file) and some
> other information that is file system dependent. If the NFS client cannot
> handle the spec definition of FH maybe the spec should be updated to
> something like Ganesha does.
> Marc.

Do we know exactly why the FH changed in this particular circumstance?
Is there some way to find out?

The NFSv3 spec has been updated - it is called "NFSv4" (now 4.2).
It says a lot more things about filehandles, but even there, the spec is
only as good as what has been implemented and tested. I'm pretty sure
that there are parts of the FH spec that have never been put into practice
- so using them would not be wise (I'm particularly thinking of volatile
file handles).

For better or worse, Linux requires directories to have stable
filehandles for NFSv3. This requirement is effectively imposed by the
dcache. If there were some way to reliably check if two filehandles
referred to the same directory, then we could relax that restriction,
but I don't think there is.

I think the other possible reason mentioned for changing the filehandle
is to support migration. NFSv3 definitely doesn't support migration.
NFSv4 explicitly tries to.

NeilBrown

>
>>
>> > Adding the parent inode to the FH helps find the name of the file by
>> > looking for the file inode in the parent directory.
>> >
>>
>> ....and directories have a ".." link, obviating the need to store parent
>> information in the filehandle.
>>
>> NeilBrown
>>
>>
>> > Marc.
>> > >> > linux-nfs-owner@vger.kernel.org wrote on 11/02/2018 05:15:42 PM: >> > >> >> From: Trond Myklebust <trondmy@hammerspace.com> >> >> To: "mbenjami@redhat.com" <mbenjami@redhat.com> >> >> Cc: "bcodding@redhat.com" <bcodding@redhat.com>, "malahal@gmail.com" >> >> <malahal@gmail.com>, "linux-nfs@vger.kernel.org" >> > <linux-nfs@vger.kernel.org> >> >> Date: 11/02/2018 05:15 PM >> >> Subject: Re: "(deleted)" directories >> >> Sent by: linux-nfs-owner@vger.kernel.org >> >> >> >> On Fri, 2018-11-02 at 18:07 -0400, Matt Benjamin wrote: >> >> > It sounds like a pretty good one, that goes to the heart of what a >> >> > specification is >> >> > >> >> >> >> While admittedly it is (still) Dia de los Muertos today, I would > think >> >> that someone who resurrected a part of the NFSv3 spec that has been >> >> unused for the full 23 years of its existence might have some >> >> explanation for why they did so? >> >> >> >> IOW: not being of a particularly religious persuasion, I usually want >> >> to understand why features are needed rather than having blind faith > in >> >> the person who wrote the spec. >> >> >> >> > Matt >> >> > >> >> > On Fri, Nov 2, 2018 at 4:26 PM, Trond Myklebust < >> >> > trondmy@hammerspace.com> wrote: >> >> > > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni wrote: >> >> > > > Ben, NFSv3 RFC1813.txt states: "If two file handles from the > same >> >> > > > server are equal, they must refer to the same file, but if >> >> > > > they are >> >> > > > not equal, no conclusions can be drawn." Ganesha does return > same >> >> > > > fileid here (inode). >> >> > > > >> >> > > > In NFSv4, they have introduced "unique_handles" attribute. I >> >> > > > don't >> >> > > > see >> >> > > > Linux NFS client using this at all though. >> >> > > >> >> > > Why does your server need to have multiple filehandles refer to > the >> >> > > same file, and why do you expect clients to support this? 
>> >> > > >> >> > > Yes, the spec allows it, but that's not a sufficient reason. >> >> > > >> >> > > > Regards, Malahal. >> >> > > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin Coddington < >> >> > > > bcodding@redhat.com> wrote: >> >> > > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote: >> >> > > > > >> >> > > > > > Hi All, we are using NFS-Ganesha with Linux NFS clients. > The >> >> > > > > > client's >> >> > > > > > shell reports the following. Based on lsof, the directory > is >> >> > > > > > marked >> >> > > > > > deleted. "cd to ROOT and cd to the same home directory > fixes >> >> > > > > > the >> >> > > > > > issue. The client behaves as though the directory is > deleted >> >> > > > > > and >> >> > > > > > recreated! Our NFS-Ganesha server implementation uses >> >> > > > > > multiple >> >> > > > > > file >> >> > > > > > handles that point to the same object. NFS spec says this >> >> > > > > > should >> >> > > > > > be >> >> > > > > > fine, but Linux NFS seems to be broken in this regard. >> >> > > > > > tcpdump >> >> > > > > > does >> >> > > > > > indicate file handle change (note that all file handles are >> >> > > > > > permanent, >> >> > > > > > meaning they are valid at the server any time) around this >> >> > > > > > issue >> >> > > > > > time. >> >> > > > > > >> >> > > > > > "shell-init: error retrieving current directory: getcwd: >> >> > > > > > cannot >> >> > > > > > access >> >> > > > > > parent directories: No such file or directory" >> >> > > > > > sh 112544 malahal cwd DIR >> >> > > > > > 0,67 >> >> > > > > > 65536 45605209 /home/malahal (deleted) >> >> > > > > > (10.120.154.42:/nfs/malahal-export/) >> >> > > > > > >> >> > > > > > Function nfs_prime_dcache() seems to invalidate the dcache >> >> > > > > > entry >> >> > > > > > if >> >> > > > > > nfs_same_file() returns false. 
nfs_same_file() does seem to >> >> > > > > > return >> >> > > > > > false with the following change, if I read it correctly, if >> >> > > > > > there >> >> > > > > > is a >> >> > > > > > file handle change. Can this be the source of my issue? It >> >> > > > > > seems >> >> > > > > > that >> >> > > > > > the client should do this only if the file handle is NOT >> >> > > > > > valid >> >> > > > > > (e.g. >> >> > > > > > if it gets ESTALE), right? >> >> > > > > > >> >> > > > > > The following commit seems to assume that the objects are >> >> > > > > > different if >> >> > > > > > they have different file handles! >> >> > > > > > commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 >> >> > > > > > Author: Trond Myklebust <trond.myklebust@primarydata.com> >> >> > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 >> >> > > > > > >> >> > > > > > NFS: Fix inode corruption in nfs_prime_dcache() >> >> > > > > >> >> > > > > My understanding is that for NFSv3 we have to assume that >> >> > > > > distinct >> >> > > > > filehandles are distinct objects, but maybe I'm wrong about >> >> > > > > this. >> >> > > > > >> >> > > > > For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 >> >> > > > > section 10.3.4 >> >> > > > > to determine if the differing filehandles are the same > object, >> >> > > > > specifically >> >> > > > > the fileid recommended attribute needs to be implemented. Is >> >> > > > > Ganesha >> >> > > > > returning the same fileid for both filehandles? 
>> >> > > > > >> >> > > > > Ben >> >> > > -- >> >> > > Trond Myklebust >> >> > > CTO, Hammerspace Inc >> >> > > 4300 El Camino Real, Suite 105 >> >> > > Los Altos, CA 94022 >> >> > > www.hammer.space >> >> > > >> >> > > >> >> > >> >> > >> >> -- >> >> Trond Myklebust >> >> CTO, Hammerspace Inc >> >> 4300 El Camino Real, Suite 105 >> >> Los Altos, CA 94022 >> >> www.hammer.space >> >> >> >> >> [attachment "signature.asc" deleted by Marc Eshel/Almaden/IBM] [attachment "signature.asc" deleted by Marc Eshel/Almaden/IBM] ^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: "(deleted)" directories 2018-11-02 20:26 ` Trond Myklebust 2018-11-02 22:07 ` Matt Benjamin @ 2018-11-03 5:00 ` Malahal Naineni 2018-11-03 5:40 ` Trond Myklebust 1 sibling, 1 reply; 24+ messages in thread From: Malahal Naineni @ 2018-11-03 5:00 UTC (permalink / raw) To: trondmy; +Cc: bcodding, linux-nfs > Why does your server need to have multiple filehandles refer to the same file Trond, the first thing I want to understand is that this server's multiple filehandles referring to the same file/directory is the reason for the client's behavior described in this email. I am pretty sure it is, but want a confirmation! Ganesha by design uses one file handle on big endian systems and another one on little endian systems. If a mixed cluster is going to serve the same file system, then the same file will have at least two different file handles. This is probably done to avoid needless host2net and net2host integer conversions. Also, how do servers deal with version changes in file handle? Looks like live migrations will be broken when there is a file handle version change? Regards, Malahal. On Sat, Nov 3, 2018 at 1:56 AM Trond Myklebust <trondmy@hammerspace.com> wrote: > > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni wrote: > > Ben, NFSv3 RFC1813.txt states: "If two file handles from the same > > server are equal, they must refer to the same file, but if they are > > not equal, no conclusions can be drawn." Ganesha does return same > > fileid here (inode). > > > > In NFSv4, they have introduced "unique_handles" attribute. I don't > > see > > Linux NFS client using this at all though. > > Why does your server need to have multiple filehandles refer to the > same file, and why do you expect clients to support this? > > Yes, the spec allows it, but that's not a sufficient reason. > > > > > Regards, Malahal. 
> > On Fri, Nov 2, 2018 at 4:35 PM Benjamin Coddington < > > bcodding@redhat.com> wrote: > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote: > > > > > > > Hi All, we are using NFS-Ganesha with Linux NFS clients. The > > > > client's > > > > shell reports the following. Based on lsof, the directory is > > > > marked > > > > deleted. "cd to ROOT and cd to the same home directory fixes the > > > > issue. The client behaves as though the directory is deleted and > > > > recreated! Our NFS-Ganesha server implementation uses multiple > > > > file > > > > handles that point to the same object. NFS spec says this should > > > > be > > > > fine, but Linux NFS seems to be broken in this regard. tcpdump > > > > does > > > > indicate file handle change (note that all file handles are > > > > permanent, > > > > meaning they are valid at the server any time) around this issue > > > > time. > > > > > > > > "shell-init: error retrieving current directory: getcwd: cannot > > > > access > > > > parent directories: No such file or directory" > > > > sh 112544 malahal cwd DIR > > > > 0,67 > > > > 65536 45605209 /home/malahal (deleted) > > > > (10.120.154.42:/nfs/malahal-export/) > > > > > > > > Function nfs_prime_dcache() seems to invalidate the dcache entry > > > > if > > > > nfs_same_file() returns false. nfs_same_file() does seem to > > > > return > > > > false with the following change, if I read it correctly, if there > > > > is a > > > > file handle change. Can this be the source of my issue? It seems > > > > that > > > > the client should do this only if the file handle is NOT valid > > > > (e.g. > > > > if it gets ESTALE), right? > > > > > > > > The following commit seems to assume that the objects are > > > > different if > > > > they have different file handles! 
> > > > commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > > > > Author: Trond Myklebust <trond.myklebust@primarydata.com> > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > > > > > > > > NFS: Fix inode corruption in nfs_prime_dcache() > > > > > > My understanding is that for NFSv3 we have to assume that distinct > > > filehandles are distinct objects, but maybe I'm wrong about this. > > > > > > For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 > > > section 10.3.4 > > > to determine if the differing filehandles are the same object, > > > specifically > > > the fileid recommended attribute needs to be implemented. Is > > > Ganesha > > > returning the same fileid for both filehandles? > > > > > > Ben > -- > Trond Myklebust > CTO, Hammerspace Inc > 4300 El Camino Real, Suite 105 > Los Altos, CA 94022 > www.hammer.space > > ^ permalink raw reply [flat|nested] 24+ messages in thread
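Malahal's point in the message above, that a handle written in host byte order differs between big-endian and little-endian servers, can be illustrated with a small Python sketch. The 8-byte handle layout and the make_handle() helper are assumptions made for illustration only; they are not Ganesha's actual file handle format. The inode number is the one from the lsof output quoted in the thread.

```python
import struct

INODE = 45605209  # inode number taken from the lsof output in the thread


def make_handle(inode, byte_order):
    """Pack a hypothetical 8-byte file handle holding only the inode.

    byte_order is a struct prefix: '<' little-endian host,
    '>' big-endian host, '!' network (big-endian) order.
    """
    return struct.pack(byte_order + "Q", inode)


# The same object yields two different byte strings in host order...
little_endian_handle = make_handle(INODE, "<")
big_endian_handle = make_handle(INODE, ">")

# ...but a single one when every server encodes in network order.
portable_handle = make_handle(INODE, "!")
```

A client that compares handles byte for byte, as the Linux client is described as doing in this thread, would see the two host-order handles as unrelated objects, while the network-order encoding gives every node in a mixed cluster the same bytes.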
* Re: "(deleted)" directories 2018-11-03 5:00 ` Malahal Naineni @ 2018-11-03 5:40 ` Trond Myklebust 0 siblings, 0 replies; 24+ messages in thread From: Trond Myklebust @ 2018-11-03 5:40 UTC (permalink / raw) To: malahal; +Cc: bcodding, linux-nfs

On Sat, 2018-11-03 at 10:30 +0530, Malahal Naineni wrote:
> > Why does your server need to have multiple filehandles refer to the
> > same file
>
> Trond, the first thing I want to understand is that this server's
> multiple filehandles referring to the same file/directory is the
> reason for the client's behavior described in this email. I am pretty
> sure it is, but want a confirmation!

We always treat different filehandles as if they refer to different
files. It has long been the case that snapshots from several vendors
are encoded to look like the same file (same fileid + same fsid),
differing only by filehandle. If we were to try to consolidate those
inodes we would end up corrupting application data.

Note that this should be consistent with NFSv4, where the state model
requires the client to treat state that belongs to different
filehandles as if it were referring to different files. The client is
allowed to (but not required to) consolidate the file data caches in
these cases; however, it may not consolidate file delegation, layout,
share or byte range locking state.

Note also that the Linux VFS's dcache model does not at all support
hard linked directories, nor does it really allow for aliased dentries.

> Ganesha by design uses one file handle on big endian systems and
> another one on little endian systems. If a mixed cluster is going to
> serve the same file system, then the same file will have at least two
> different file handles. This is probably done to avoid needless
> host2net and net2host integer conversions.

The Linux client will not treat those filehandles as if they refer to
the same inode for the reasons stated above.
I understand your performance motivation, but backward compatibility
with existing setups must come first.

Note that if each filesystem is served from only one server, then
everything will work fine. It is when you start to play games with
clustering IP addresses or similar that things break, and as I said,
the NFSv4 protocol would be a horrible mess for this case.

> Also, how do servers deal with version changes in file handle? Looks
> like live migrations will be broken when there is a file handle
> version change?

Sure, but again no existing servers are using that feature and nobody
so far has volunteered any interest in working on it.

> Regards, Malahal.
>
> On Sat, Nov 3, 2018 at 1:56 AM Trond Myklebust <
> trondmy@hammerspace.com> wrote:
> > On Fri, 2018-11-02 at 21:24 +0530, Malahal Naineni wrote:
> > > Ben, NFSv3 RFC1813.txt states: "If two file handles from the same
> > > server are equal, they must refer to the same file, but if
> > > they are
> > > not equal, no conclusions can be drawn." Ganesha does return same
> > > fileid here (inode).
> > >
> > > In NFSv4, they have introduced "unique_handles" attribute. I
> > > don't
> > > see
> > > Linux NFS client using this at all though.
> >
> > Why does your server need to have multiple filehandles refer to the
> > same file, and why do you expect clients to support this?
> >
> > Yes, the spec allows it, but that's not a sufficient reason.
> >
> > > Regards, Malahal.
> > > On Fri, Nov 2, 2018 at 4:35 PM Benjamin Coddington <
> > > bcodding@redhat.com> wrote:
> > > > On 2 Nov 2018, at 1:26, Malahal Naineni wrote:
> > > >
> > > > > Hi All, we are using NFS-Ganesha with Linux NFS clients. The
> > > > > client's
> > > > > shell reports the following. Based on lsof, the directory is
> > > > > marked
> > > > > deleted. "cd to ROOT and cd to the same home directory fixes
> > > > > the
> > > > > issue. The client behaves as though the directory is deleted
> > > > > and
> > > > > recreated!
Our NFS-Ganesha server implementation uses > > > > > multiple > > > > > file > > > > > handles that point to the same object. NFS spec says this > > > > > should > > > > > be > > > > > fine, but Linux NFS seems to be broken in this regard. > > > > > tcpdump > > > > > does > > > > > indicate file handle change (note that all file handles are > > > > > permanent, > > > > > meaning they are valid at the server any time) around this > > > > > issue > > > > > time. > > > > > > > > > > "shell-init: error retrieving current directory: getcwd: > > > > > cannot > > > > > access > > > > > parent directories: No such file or directory" > > > > > sh 112544 malahal cwd DIR > > > > > 0,67 > > > > > 65536 45605209 /home/malahal (deleted) > > > > > (10.120.154.42:/nfs/malahal-export/) > > > > > > > > > > Function nfs_prime_dcache() seems to invalidate the dcache > > > > > entry > > > > > if > > > > > nfs_same_file() returns false. nfs_same_file() does seem to > > > > > return > > > > > false with the following change, if I read it correctly, if > > > > > there > > > > > is a > > > > > file handle change. Can this be the source of my issue? It > > > > > seems > > > > > that > > > > > the client should do this only if the file handle is NOT > > > > > valid > > > > > (e.g. > > > > > if it gets ESTALE), right? > > > > > > > > > > The following commit seems to assume that the objects are > > > > > different if > > > > > they have different file handles! > > > > > commit 7dc72d5f7a0ec97a53e126c46e2cbd2560757955 > > > > > Author: Trond Myklebust <trond.myklebust@primarydata.com> > > > > > Date: Thu Sep 22 13:38:52 2016 -0400 > > > > > > > > > > NFS: Fix inode corruption in nfs_prime_dcache() > > > > > > > > My understanding is that for NFSv3 we have to assume that > > > > distinct > > > > filehandles are distinct objects, but maybe I'm wrong about > > > > this. 
> > > > > > > > For NFSv4.x, we can follow the guidance in RFCs 5661 or 7530 > > > > section 10.3.4 > > > > to determine if the differing filehandles are the same object, > > > > specifically > > > > the fileid recommended attribute needs to be implemented. Is > > > > Ganesha > > > > returning the same fileid for both filehandles? > > > > > > > > Ben > > -- > > Trond Myklebust > > CTO, Hammerspace Inc > > 4300 El Camino Real, Suite 105 > > Los Altos, CA 94022 > > www.hammer.space > > > > -- Trond Myklebust Linux NFS client maintainer, Hammerspace trond.myklebust@hammerspace.com ^ permalink raw reply [flat|nested] 24+ messages in thread
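Trond's snapshot argument in the message above can be shown with a toy model. This is a sketch under stated assumptions, not the Linux NFS client's inode lookup: the build_cache() helper, the file names, the handle bytes and the fsid/fileid values are all made up for illustration.

```python
def build_cache(objects, key):
    """Group object names under the identity chosen by `key`.

    `objects` is a list of (name, filehandle, fsid, fileid) tuples and
    `key` maps (filehandle, fsid, fileid) to a cache key.
    """
    cache = {}
    for name, filehandle, fsid, fileid in objects:
        cache.setdefault(key(filehandle, fsid, fileid), []).append(name)
    return cache


# A live file and its snapshot, encoded as Trond describes:
# same fsid and fileid, different filehandle (values are hypothetical).
objects = [
    ("live/data.db", b"FH-live", (7, 0), 1001),
    ("snapshot/data.db", b"FH-snap", (7, 0), 1001),
]

# Keyed by filehandle: two separate cache entries, one per object.
by_filehandle = build_cache(objects, lambda fh, fsid, fileid: fh)

# Keyed by (fsid, fileid): both names collapse into a single entry.
by_fsid_fileid = build_cache(objects, lambda fh, fsid, fileid: (fsid, fileid))
```

In the second cache the live file and its snapshot share one entry, so cached data for one would be served for the other. That is the consolidation Trond says would corrupt application data, and it is why a server that hands out two handles for one directory looks, to such a client, like two directories.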
end of thread, other threads:[~2018-11-06 22:17 UTC | newest]

Thread overview: 24+ messages
2018-11-02 5:26 "(deleted)" directories Malahal Naineni
2018-11-02 11:05 ` Benjamin Coddington
2018-11-02 15:54 ` Malahal Naineni
2018-11-02 20:26 ` Trond Myklebust
2018-11-02 22:07 ` Matt Benjamin
2018-11-03 0:15 ` Trond Myklebust
2018-11-03 2:38 ` Marc Eshel
2018-11-03 3:27 ` Trond Myklebust
2018-11-03 3:37 ` Marc Eshel
2018-11-04 5:31 ` NeilBrown
2018-11-04 19:47 ` Marc Eshel
2018-11-05 0:32 ` NeilBrown
2018-11-05 4:41 ` Malahal Naineni
2018-11-05 5:09 ` NeilBrown
2018-11-05 7:40 ` Malahal Naineni
2018-11-05 13:45 ` Trond Myklebust
2018-11-05 23:49 ` NeilBrown
2018-11-06 0:37 ` Trond Myklebust
2018-11-06 8:43 ` Mkrtchyan, Tigran
2018-11-06 19:25 ` Bill Baker
2018-11-06 22:17 ` Rick Macklem
2018-11-05 4:48 ` Marc Eshel
2018-11-03 5:00 ` Malahal Naineni
2018-11-03 5:40 ` Trond Myklebust