From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51325) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YhNFG-0001fO-W7 for qemu-devel@nongnu.org; Sun, 12 Apr 2015 15:08:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1YhNFE-0004AR-ON for qemu-devel@nongnu.org; Sun, 12 Apr 2015 15:08:38 -0400 Received: from mail-qk0-x22b.google.com ([2607:f8b0:400d:c09::22b]:35786) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1YhNFE-0004AN-Hz for qemu-devel@nongnu.org; Sun, 12 Apr 2015 15:08:36 -0400 Received: by qkhg7 with SMTP id g7so138730392qkh.2 for ; Sun, 12 Apr 2015 12:08:36 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20150412140956.GN889@ZenIV.linux.org.uk> References: <20140702135258.23882.15100.malonedeb@soybean.canonical.com> <20150410123059.26540.1036.malone@gac.canonical.com> <20150412140956.GN889@ZenIV.linux.org.uk> From: Eric Van Hensbergen Date: Sun, 12 Apr 2015 14:08:15 -0500 Message-ID: Content-Type: multipart/alternative; boundary=001a1134fdb270473805138bb92a Subject: Re: [Qemu-devel] [Bug 1336794] Re: 9pfs does not honor open file handles on unlinked files List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Al Viro Cc: Bug 1336794 <1336794@bugs.launchpad.net>, qemu-devel --001a1134fdb270473805138bb92a Content-Type: text/plain; charset=UTF-8 On Sun, Apr 12, 2015 at 9:09 AM, Al Viro wrote: > On Sun, Apr 12, 2015 at 12:42:35PM -0000, Eric Van Hensbergen wrote: > > > In other words, it only uses the open fd to derrive a path and then > > executes the getattr off of that path. If that path no longer exists > > (because of unlink or remove) then you are hosed. In my understanding, > the > > "work around" I suppose is the so-called 'silly renaming' where > > remove/unlink simply does a rename until all open instances are closed. > > What do you mean, "no longer exists"? Don't confuse path with pathname - > it's a mount,dentry pair. And dentry in question bloody well ought to > still > have the FID associated with it - you shouldn't use the same FID for > TREMOVE and for TREAD/TWRITE. Quite right, the fid is still there, I just don't have an easy way to get at it. vfs_file operations have a direct notion of the fid because they cache it in the struct file private data. dentries have several fids associated with them and stored in thier private data, so I've got to "guess" the right one. In most cases this probably won't cause a problem, but it just feels a bit off. There was a thread a few years back where Miklos was arguing for fstat to pass through struct file information since the the fd given the fstat had a file structure associated with it (which in 9p's case has a direct pointer to the "right" fid): http://lwn.net/Articles/251228/ In any case, I've drafted a quick patch which takes the approach of searching the dentry fid list for the fid, but it doesn't feel like the right answer and I'm fairly certain I need to iterate on it a few times to make sure I haven't hosed something up. > TREMOVE clunks the FID passed to it; on > some servers you really have no choice - server discards the file > completely > and on any FID that used to refer to it you get an error from that point > on. > On those you'd really have to do something like sillyrename - the only > way to keep IO going for a file sitting on such server is to have it > visible somewhere. Normal fs(4) is that way; e.g. u9fs(4) isn't - there > FID maps to opened file descriptor on host and TREMOVE on another FID > doesn't break it, as long as host supports IO on opened-but-unlinked files. > I don't remember where qemu 9pfs falls in that respect, but I'd expect it > to be more like u9fs... > > Sort of, the servers in kvmtool and qemu (and diod) have a fid with the open handle. However, all of them seem to implement getattr assuming they have to re-walk to the file. In order to test my aforementioned changes to the client, I also did a quick patch to kvmtool which checks and sees if the fid it receives has an fd and just uses fstat instead of lstat. Patch is here at the moment, I'll send upstream once I'm happy with the client side changes and look into creating a patch for qemu/diod: https://github.com/ericvh/linux-kvm/commit/2fa2f7e728ac08a7d9006516870db1a986aa6acc > Which FID are you passing to server on unlink()? > > Unlink/remove look to be getting a proper fid (in other words, not using the one that is open). The problem is that getattr is using a reference fid (an open fid that's already walked to the name). From a protocol semantics perspective the fixes mentioned above probably don't help that we may have some dangling unopen fids pointing at a name that is no longer valid, but that is just a consequence of the imperfect nature of the mapping of dentries to fids and I'm not sure it does any harm. A trace from the original problem on diod (which appears to not implement unlink and is falling back to remove). diod: P9_TWALK tag 1 fid 1 newfid 2 nwname 1 'foobar' diod: P9_RLERROR tag 1 ecode 2 diod: P9_TWALK tag 1 fid 1 newfid 2 nwname 0 diod: P9_RWALK tag 1 nwqid 0e diod: P9_TLCREATE tag 1 fid 2 name 'foobar' flags 0x8042 mode 0100644 gid 0 diod: P9_RLCREATE tag 1 qid (000000000012524b 0 '') iounit 0 diod: P9_TWALK tag 1 fid 1 newfid 3 nwname 1 'foobar' diod: P9_RWALK tag 1 nwqid 1 (000000000012524b 0 '') diod: P9_TGETATTR tag 1 fid 3 request_mask 0x17ff diod: P9_RGETATTR tag 1 valid 0x17ff qid (000000000012524b 0 '') mode 0100644 uid 0 gid 0 nlink 1 rdev 0 size 0 blksize 4096 blocks 0 atime Mon Apr 6 11:11:08 2015 mtime Mon Apr 6 11:11:08 2015 ctime Mon Apr 6 11:11:08 2015 btime X gen 0 data_version X diod: P9_TUNLINKAT tag 1 dirfid 1 name 'foobar' flags 0 diod: P9_RLERROR tag 1 ecode 95 diod: P9_TWALK tag 1 fid 3 newfid 4 nwname 0 diod: P9_RWALK tag 1 nwqid 0 diod: P9_TREMOVE tag 1 fid 4 diod: P9_RREMOVE tag 1 diod: P9_TGETATTR tag 1 fid 3 request_mask 0x3fff diod: P9_RLERROR tag 1 ecode 2 diod: P9_TCLUNK tag 1 fid 2 diod: P9_RCLUNK tag 1 diod: P9_TCLUNK tag 1 fid 3 diod: P9_RCLUNK tag 1 The client cloning 3 to 4 before the remove seems a bit unnecessary, but is probably done in the case that the remove fails on the server so that we still have a dentry reference. --001a1134fdb270473805138bb92a Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
On S= un, Apr 12, 2015 at 9:09 AM, Al Viro <viro@zeniv.linux.org.uk>= ; wrote:
On Sun, Apr 12, 2015 a= t 12:42:35PM -0000, Eric Van Hensbergen wrote:

> In other words, it only uses the open fd to derrive a path and then > executes the getattr off of that path.=C2=A0 If that path no longer ex= ists
> (because of unlink or remove) then you are hosed.=C2=A0 In my understa= nding, the
> "work around" I suppose is the so-called 'silly renaming= ' where
> remove/unlink simply does a rename until all open instances are closed= .

What do you mean, "no longer exists"?=C2=A0 Don't conf= use path with pathname -
it's a mount,dentry pair.=C2=A0 And dentry in question bloody well ough= t to still
have the FID associated with it - you shouldn't use the same FID for TREMOVE and for TREAD/TWRITE.=C2=A0

Quite = right, the fid is still there, I just don't have an easy way to get at = it. =C2=A0vfs_file operations have a direct notion of the fid because they = cache it in the struct file private data. =C2=A0dentries have several fids = associated with them and stored in thier private data, so I've got to &= quot;guess" the right one.=C2=A0 In most cases this probably won't= cause a problem, but it just feels a bit off.

The= re was a thread a few years back where Miklos was arguing for fstat to pass= through struct file information since the the fd given the fstat had a fil= e structure associated with it (which in 9p's case has a direct pointer= to the "right" fid):=C2=A0
=C2=A0 =C2=A0 http://lwn.net/Articles/251228/=C2=A0

In any case, I've drafted a quick patch which ta= kes the approach of searching the dentry fid list for the fid, but it doesn= 't feel like the right answer and I'm fairly certain I need to iter= ate on it a few times to make sure I haven't hosed something up.
<= div>=C2=A0
TREMOVE clunks the FID passed to it; on some servers you really have no choice - server discards the file completel= y
and on any FID that used to refer to it you get an error from that point on= .
On those you'd really have to do something like sillyrename - the only<= br> way to keep IO going for a file sitting on such server is to have it
visible somewhere.=C2=A0 Normal fs(4) is that way; e.g. u9fs(4) isn't -= there
FID maps to opened file descriptor on host and TREMOVE on another FID
doesn't break it, as long as host supports IO on opened-but-unlinked fi= les.
I don't remember where qemu 9pfs falls in that respect, but I'd exp= ect it
to be more like u9fs...


Sort of, the servers in kvmtool and qe= mu (and diod) have a fid with the open handle.=C2=A0 However, all of them s= eem to implement getattr assuming they have to re-walk to the file.=C2=A0 I= n order to test my aforementioned changes to the client, I also did a quick= patch to kvmtool which checks and sees if the fid it receives has an fd an= d just uses fstat instead of lstat.=C2=A0 Patch is here at the moment, I= 9;ll send upstream once I'm happy with the client side changes and look= into creating a patch for qemu/diod:
=C2=A0
Which FID are you passing to server on unlink()?


Unlink/remove look to be getting a pro= per fid (in other words, not using the one that is open).=C2=A0 The problem= is that getattr is using a reference fid (an open fid that's already w= alked to the name).=C2=A0 From a protocol semantics perspective the fixes m= entioned above probably don't help that we may have some dangling unope= n fids pointing at a name that is no longer valid, but that is just a conse= quence of the imperfect nature of the mapping of dentries to fids and I'= ;m not sure it does any harm.=C2=A0 A trace from the original problem on di= od (which appears to not implement unlink and is falling back to remove).

diod: P9_TWALK t= ag 1 fid 1 newfid 2 nwname 1 'foobar'

diod: P9_RLERROR= tag 1 ecode 2

diod: P9_TWALK t= ag 1 fid 1 newfid 2 nwname 0

diod: P9_RWALK t= ag 1 nwqid 0e

diod: P9_TLCREAT= E tag 1 fid 2 name 'foobar' flags 0x8042 mode 0100644 gid 0

diod: P9_RLCREAT= E tag 1 qid (000000000012524b 0 '') iounit 0

diod: P9_TWALK tag 1 fid 1 newfid 3 nwname 1 'foobar'

diod: P9_RWALK t= ag 1 nwqid 1 (000000000012524b 0 '')

diod: P9_TGETATT= R tag 1 fid 3 request_mask 0x17ff

diod: P9_RGETATT= R tag 1 valid 0x17ff qid (000000000012524b 0 '') mode 0100644 uid 0 g= id 0 nlink 1 rdev 0 size 0 blksize 4096 blocks 0 atime Mon Apr=C2=A0 6 11:11:08 2015 mti= me Mon Apr=C2=A0 6 11:11:08 2015 ctime Mon Apr=C2=A0 6 11:11:08 2015 btime = X gen 0 data_version X

diod: P9_TUNLINK= AT tag 1 dirfid 1 name 'foobar' flags 0

diod: P9_RLERROR= tag 1 ecode 95

diod: P9_TWALK t= ag 1 fid 3 newfid 4 nwname 0

diod: P9_RWALK t= ag 1 nwqid 0

diod: P9_TREMOVE= tag 1 fid 4

diod: P9_RREMOVE= tag 1

diod: P9_TGETATT= R tag 1 fid 3 request_mask 0x3fff

diod: P9_RLERROR= tag 1 ecode 2

diod: P9_TCLUNK = tag 1 fid 2

diod: P9_RCLUNK = tag 1

diod: P9_TCLUNK = tag 1 fid 3

diod: P9_RCLUNK = tag 1


The client cloning 3 to 4 = before the remove seems a bit unnecessary, but is probably done in the case= that the remove fails on the server so that we still have a dentry referen= ce.
--001a1134fdb270473805138bb92a--