* Re: [fuse-devel] Changes in 4.7.
From: Miklos Szeredi @ 2016-05-31 7:17 UTC
To: Stef Bon; +Cc: fuse-devel, linux-fsdevel, linux-kernel

On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
> Hi,
>
> I've read some news about the 4.7 kernel:
>
> "And in particular, if
> you're a low-level filesystem person, or involved in other ways in
> path component lookup (security layer etc), go check that everything
> looks ok, and if your filesystem isn't one that does parallel lookups
> or readdirs yet (because locking issues), take a look at that too."
>
> https://lkml.org/lkml/2016/5/29/77
>
> Does this have consequences for fuse?
> I know that with some filesystems I've written the readdir call locks
> the directory exclusively.

The problem would be if fuse filesystems assumed serialized
lookup/readdir and don't do any locking themselves.

We probably need to conditionally re-add the lookup/readdir
serialization to the fuse kernel module, with an INIT flag to
explicitly enable parallel readdir and lookup (i.e. disable the
serialization).

Thanks,
Miklos
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-05-31 10:57 UTC
To: linux-fsdevel; +Cc: fuse-devel

2016-05-31 9:17 GMT+02:00 Miklos Szeredi <miklos@szeredi.hu>:
> On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
>> Hi,
>>
>
> The problem would be if fuse filesystems assumed serialized
> lookup/readdir and don't do any locking themselves.

Yes of course.

> We probably need to conditionally re-add the lookup/readdir
> serialization to the fuse kernel module, with an INIT flag to
> explicitly enable parallel readdir and lookup (i.e. disable the
> serialization).

Yes, via an INIT flag, where serialization will be the default to
provide backwards compatibility.
You say "re-add". Has this been in the kernel module before?

Stef
* Re: [fuse-devel] Changes in 4.7.
From: Miklos Szeredi @ 2016-05-31 11:09 UTC
To: Stef Bon; +Cc: linux-fsdevel, fuse-devel

On Tue, May 31, 2016 at 12:57 PM, Stef Bon <stefbon@gmail.com> wrote:
> 2016-05-31 9:17 GMT+02:00 Miklos Szeredi <miklos@szeredi.hu>:
>> On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
>>> Hi,
>>>
>>
>> The problem would be if fuse filesystems assumed serialized
>> lookup/readdir and don't do any locking themselves.
>
> Yes of course.
>
>> We probably need to conditionally re-add the lookup/readdir
>> serialization to the fuse kernel module, with an INIT flag to
>> explicitly enable parallel readdir and lookup (i.e. disable the
>> serialization).
>
> Yes, via an INIT flag, where serialization will be the default to
> provide backwards compatibility.
> You say "re-add". Has this been in the kernel module before?

It has been in the VFS before. Now we need to re-add it into the fuse
module.

Thanks,
Miklos
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-05-31 16:25 UTC
To: fuse-devel, linux-fsdevel

Hi,

I've been thinking about the non-serialized readdirs. I do not understand.
Readdirs have to be serialized, since the offset of the next readdir
(belonging to the opendir) is known when the current readdir is
finished:
"start where current left".

So it probably means that a lookup (which is in fact a read-only
operation) of a name in a directory can be done while a readdir is
active on the same directory.

Or something else? Two readdirs (or more) belonging to different
opendirs active on the same directory?

Stef
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-05-31 17:22 UTC
To: fuse-devel, linux-fsdevel

Hi,

the article explains a lot:

https://lwn.net/Articles/685108/

But readdirs are still serialized, contrary to what Linus Torvalds
writes in this message. It's only about lookups in parallel, where the
case of two different lookups of the same name in the same parent is
taken care of.

Stef
* Re: [fuse-devel] Changes in 4.7.
From: Al Viro @ 2016-05-31 17:44 UTC
To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Tue, May 31, 2016 at 06:25:24PM +0200, Stef Bon wrote:

> I've been thinking about the non-serialized readdirs. I do not understand.
> Readdirs have to be serialized, since the offset of the next readdir
> (belonging to the opendir) is known when the current readdir is
> finished:
> "start where current left".

They are serialized per struct file (and so is lseek() on them, for that
matter). So the state that is associated with an opened file is just
fine; it's modifiable state associated with the directory itself, and
shared between all opened files, that would be a problem.

IOW, they can do readdir in parallel exactly in the cases when lseek
done by one of them would not affect another.
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-05-31 18:44 UTC
To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-05-31 19:44 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:
> On Tue, May 31, 2016 at 06:25:24PM +0200, Stef Bon wrote:
>
>> I've been thinking about the non-serialized readdirs. I do not understand.
>> Readdirs have to be serialized, since the offset of the next readdir
>> (belonging to the opendir) is known when the current readdir is
>> finished:
>> "start where current left".
>
> They are serialized per struct file (and so is lseek() on them, for that
> matter). So the state that is associated with an opened file is just
> fine; it's modifiable state associated with the directory itself, and
> shared between all opened files, that would be a problem.

I'm really sorry, but what do you mean by struct file? We're talking
about directories, and I do not understand what the meaning of the
struct file is here.

> IOW, they can do readdir in parallel exactly in the cases when lseek
> done by one of them would not affect another.

And when does lseek not affect another? Is this right: when there are
no changes in the entries, no entries are created or removed
(or moved away)?
(Which probably means the cache of names in the directory is up to date.)

Stef
* Re: [fuse-devel] Changes in 4.7.
From: Al Viro @ 2016-05-31 20:29 UTC
To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Tue, May 31, 2016 at 08:44:34PM +0200, Stef Bon wrote:

> > IOW, they can do readdir in parallel exactly in the cases when lseek
> > done by one of them would not affect another.
>
> And when does lseek not affect another? Is this right: when there are
> no changes in the entries, no entries are created or removed
> (or moved away)?
> (Which probably means the cache of names in the directory is up to date.)

No. On any Unix, since before the transition to PDP-11, there is a
distinction between file descriptor and opened file. There are three
layers of objects in there:
	1) descriptor
	2) opened file
	3) filesystem object
Each has its own set of properties and system calls to manipulate those.

open(2) creates new objects in layers 1 and 2. dup(2) acts in layer 1
alone - you get a new descriptor referring to the same opened file
(unfortunate name, that - something like "open IO channel" would be less
confusing). fork(2) also acts only on layer 1. The primary effect of
close(2) is also in layer 1, but the side effects might reach into
layer 2.

lseek(2) is a layer 2 operation. Current IO position is a property of an
opened file, *not* of a descriptor or of the underlying filesystem
object. Had been, since the moment they'd implemented redirects. For a
TTY it doesn't matter, but think what happens when you do
(date; ls) > foo. The shell opens the file we are redirecting to and
uses dup2() (or close() + dup(), for that matter) to make descriptor 1
(stdout) point to it. Then date(1) writes a string to its descriptor 1
(inherited from the shell). Then ls(1) does the same. Both pieces of
output end up written to foo; so far, so good, but you want the output
of ls(1) to start *after* the output of date(1).

In other words, current IO position should be shared across fork() and
dup(). OTOH, it obviously can't be a property of the underlying
filesystem object - you do _not_ want e.g. grep qsort *.[ch] from one
terminal to play havoc on cc a.c from another.

Each time you call open(2) you get a new opened file (IO channel,
whatever you call it) *and* a new descriptor referring to it.
fork()/dup()/dup2()/close() act upon descriptors; so does exit(), for
that matter. When all references to an opened file disappear, that
opened file gets closed. It is a common effect of close(2), but it's a
separate event; moreover, that event might have further side effects -
if a file had been opened and unlinked, closing the opened file in
question might trigger the destruction of the underlying filesystem
object, provided there are no surviving hardlinks to it.

read()/write()/lseek() act upon the opened file; it is specified by
descriptor, but the effects are the same whichever descriptor referring
to that opened file had been used. On the other hand, the effect *does*
depend upon the opened file being involved, not just the underlying
filesystem object.

All of the above applies to directories. Well, almost - you get
getdents(2) instead of read(2) and no analogue of write(2). The notion
of the current IO position, descriptor vs. opened file distinction,
difference between open() + dup() and open() + open() - all of that is
identical to the situation with regular files.

"struct file" is a fairly common name for the structure representing an
opened file (regardless of the file type). On all kinds of Unices, Linux
included...
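The (date; ls) > foo case can be reproduced without a shell. The following is only a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name fork_shares_offset and the temporary path are illustrative and do not come from the thread. A child stands in for date(1) and the parent for ls(1); both write through descriptors that refer to the same opened file, so the second write should land after the first.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Illustrative helper (name and temp path are assumptions, not from the
 * thread). Returns 0 on success and stores the file contents in `out`.
 */
int fork_shares_offset(char *out, size_t outlen)
{
    char path[] = "/tmp/offset-demo-XXXXXX";
    int fd, status;
    pid_t pid;
    off_t pos;
    ssize_t n;

    assert(out != NULL && outlen > 0);

    fd = mkstemp(path);             /* one open: one opened file */
    if (fd < 0)
        return -1;
    unlink(path);                   /* object goes away with the last reference */

    pid = fork();                   /* child's descriptor aliases the same opened file */
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)                   /* stands in for date(1) */
        _exit(write(fd, "date\n", 5) == 5 ? 0 : 1);

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* stands in for ls(1): the shared offset is already 5 */
    if (write(fd, "ls\n", 3) != 3) {
        close(fd);
        return -1;
    }
    pos = lseek(fd, 0, SEEK_CUR);   /* 5 + 3 if the offset was shared */

    n = pread(fd, out, outlen - 1, 0);  /* read back without moving the offset */
    close(fd);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return pos == 8 ? 0 : -1;
}
```

If the offset belonged to the descriptor rather than to the opened file, the parent's write would start at 0 and overwrite the child's output instead of following it.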
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-06-01 12:32 UTC
To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-05-31 22:29 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:

> All of the above applies to directories. Well, almost - you get
> getdents(2) instead of read(2) and no analogue of write(2). The notion
> of the current IO position, descriptor vs. opened file distinction,
> difference between open() + dup() and open() + open() - all of that is
> identical to the situation with regular files.
>
> "struct file" is a fairly common name for the structure representing an
> opened file (regardless of the file type). On all kinds of Unices, Linux
> included...

I understand that a directory is similar to a file, and that the struct
file also applies to a directory.
Thanks a lot for your detailed explanation!

When does an lseek not affect another? Does this depend on how the
filesystem deals with a directory, right? How is it stored? When a
directory is nothing more than a linked list, where new entries are
appended at the end, seeking through the linked list will not get mixed
up compared to using another method, like the skiplists I use (and only
those) for directories.

Stef
* Re: [fuse-devel] Changes in 4.7.
From: Al Viro @ 2016-06-01 13:52 UTC
To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Wed, Jun 01, 2016 at 02:32:46PM +0200, Stef Bon wrote:

> I understand that a directory is similar to a file, and that the struct
> file also applies to a directory.
> Thanks a lot for your detailed explanation!
>
> When does an lseek not affect another? Does this depend on how the
> filesystem deals with a directory, right? How is it stored? When a
> directory is nothing more than a linked list, where new entries are
> appended at the end, seeking through the linked list will not get mixed
> up compared to using another method, like the skiplists I use (and only
> those) for directories.

No, it has nothing to do with the way the directory is stored, etc.
After

	int fd1 = open("foo", O_DIRECTORY);
	int fd2 = dup(fd1);
	int fd3 = open("foo", O_DIRECTORY);

lseek() on fd1 and fd2 manipulate the same object; that on fd3 is
independent. Current IO position, both for regular files and
directories, is a property of an object created by open();
dup()/dup2()/fork() create aliases for those objects.

getdents(2) is serialized on a per-open() basis; if two descriptors are
aliases ultimately coming from the same open() call, the calls of
getdents() on them will be treated as if they were sequential calls of
getdents() on the same descriptor - each call will read a new chunk of
the directory. If descriptors result from separate open() calls,
getdents() on one of them has no effect on getdents() on another.

It's exactly the same as for regular files - if you have

	int fd1 = open("bar", 0);	// first channel opened
	int fd2 = dup(fd1);		// fd2 refers to the same channel
	int fd3 = open("bar", 0);	// second channel opened
	char c1, c2, c3;
	read(fd1, &c1, 1);	// offset of the first channel goes 0 -> 1
	read(fd2, &c2, 1);	// offset of the first channel goes 1 -> 2
	read(fd3, &c3, 1);	// offset of the second channel goes 0 -> 1

c1 and c3 will contain the first byte of our file and c2 - the second
one. If you continue that with

	lseek(fd2, 0, SEEK_SET);	// offset of the first channel goes 2 -> 0
	read(fd1, &c1, 1);

c1 will be the first byte. If that lseek() had been done to fd3 instead
of fd2, c1 would be the third byte, since the offset in the first
channel would've been unaffected by the operation on the second one.

For regular files, the kernel serializes read()/write()/lseek() done on
descriptors aliasing each other. Now it does the same for
getdents()/lseek() of directories.

From the filesystem point of view, you might see two getdents() called
in parallel only if their results should be unaffected by the order of
operations.
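The fragments above can be turned into something compilable. This is a sketch under stated assumptions rather than anything posted in the thread: the helper name dup_vs_open, the temporary path, and the three-byte file "abc" are all illustrative. It performs the three reads from the message and then both lseek() variants one after the other (on fd3 first, then on fd2), recording each byte read.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Illustrative helper (name, path and file contents are assumptions).
 * Fills out[0..4] with the bytes read and NUL-terminates; returns 0 on
 * success, -1 on any failure.
 */
int dup_vs_open(char out[6])
{
    char path[] = "/tmp/channel-demo-XXXXXX";
    int tmp, fd1, fd2, fd3, ok;

    assert(out != NULL);

    tmp = mkstemp(path);                /* create a file containing "abc" */
    if (tmp < 0)
        return -1;
    ok = write(tmp, "abc", 3) == 3;
    close(tmp);

    fd1 = open(path, O_RDONLY);         /* first channel opened */
    fd2 = dup(fd1);                     /* fd2 refers to the same channel */
    fd3 = open(path, O_RDONLY);         /* second channel opened */
    unlink(path);
    ok = ok && fd1 >= 0 && fd2 >= 0 && fd3 >= 0;

    ok = ok && read(fd1, &out[0], 1) == 1;      /* first channel: 0 -> 1 */
    ok = ok && read(fd2, &out[1], 1) == 1;      /* first channel: 1 -> 2 */
    ok = ok && read(fd3, &out[2], 1) == 1;      /* second channel: 0 -> 1 */

    ok = ok && lseek(fd3, 0, SEEK_SET) == 0;    /* second channel: 1 -> 0 */
    ok = ok && read(fd1, &out[3], 1) == 1;      /* first channel untouched: 2 -> 3 */

    ok = ok && lseek(fd2, 0, SEEK_SET) == 0;    /* first channel: 3 -> 0 */
    ok = ok && read(fd1, &out[4], 1) == 1;      /* first channel: 0 -> 1 */
    out[5] = '\0';

    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    if (fd3 >= 0)
        close(fd3);
    return ok ? 0 : -1;
}
```

With the file containing "abc", the recorded bytes are a, b, a, c, a: seeking fd3 leaves the first channel at offset 2, so fd1 reads the third byte, while seeking fd2 rewinds the channel that fd1 also uses.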
* Re: [fuse-devel] Changes in 4.7.
From: Stef Bon @ 2016-06-01 14:44 UTC
To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-06-01 15:52 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:

> For regular files, the kernel serializes read()/write()/lseek() done on
> descriptors aliasing each other. Now it does the same for
> getdents()/lseek() of directories.
>
> From the filesystem point of view, you might see two getdents() called
> in parallel only if their results should be unaffected by the order of
> operations.

Ah right! Now I understand. Thanks a lot for your explanation!
Of course when a second descriptor refers to the same object, operations
like read/lseek have to be serialized. If not, strange things will
happen.
I was thinking on a totally different level.

Stef
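The same aliasing can be observed on a directory. The sketch below is an assumption-laden illustration, not code from the thread: the helper name dir_offsets is made up, and it uses the portable fdopendir()/readdir() wrappers instead of calling getdents(2) directly. It also assumes that lseek(fd, 0, SEEK_CUR) on a directory descriptor reports the current position, which holds on Linux; the actual offset values are filesystem-specific, so only their relationship is meaningful.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Illustrative helper (name is an assumption). Reads one entry through
 * the first opened directory and reports the position seen through
 * that channel, through its dup() alias, and through a separate open().
 * Returns 0 on success, -1 on failure.
 */
int dir_offsets(const char *dirpath, off_t *first, off_t *alias, off_t *second)
{
    int fd1, fd2, fd3, ret = -1;
    DIR *dir;

    assert(dirpath && first && alias && second);

    fd1 = open(dirpath, O_RDONLY | O_DIRECTORY);    /* first channel */
    if (fd1 < 0)
        return -1;
    fd2 = dup(fd1);                                 /* alias of the first channel */
    fd3 = open(dirpath, O_RDONLY | O_DIRECTORY);    /* second channel */

    /* fdopendir() takes ownership of fd1 when it succeeds */
    dir = (fd2 >= 0 && fd3 >= 0) ? fdopendir(fd1) : NULL;

    if (dir != NULL && readdir(dir) != NULL) {      /* fetches entries via fd1 */
        *first = lseek(dirfd(dir), 0, SEEK_CUR);
        *alias = lseek(fd2, 0, SEEK_CUR);           /* same opened file as fd1 */
        *second = lseek(fd3, 0, SEEK_CUR);          /* independent opened file */
        ret = 0;
    }

    if (dir != NULL)
        closedir(dir);                              /* also closes fd1 */
    else
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    if (fd3 >= 0)
        close(fd3);
    return ret;
}
```

Called on any readable directory, the alias should report the same position as the first descriptor (typically nonzero once entries have been fetched), while the descriptor from the separate open() should still be at 0.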