* Re: [fuse-devel] Changes in 4.7.
       [not found] <CANXojcwuLVpyqkXxcUhKdG=nDEp-XspR4PtDerpoqO2FmMLp5w@mail.gmail.com>
@ 2016-05-31  7:17 ` Miklos Szeredi
  2016-05-31 10:57   ` Stef Bon
       [not found]   ` <nijg09$6k0$1@ger.gmane.org>
  0 siblings, 2 replies; 11+ messages in thread
From: Miklos Szeredi @ 2016-05-31  7:17 UTC (permalink / raw)
  To: Stef Bon; +Cc: fuse-devel, linux-fsdevel, linux-kernel

On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
> Hi,
>
> I've read some news about the 4.7 kernel :
>
> "And in particular, if
> you're a low-level filesystem person, or involved in other ways in
> path component lookup (security layer etc), go check that everything
> looks ok, and if your filesystem isn't one that does parallel lookups
> or readdirs yet (because locking issues), take a look at that too."
>
> https://lkml.org/lkml/2016/5/29/77
>
> Does this have consequences for fuse?
> I know that in some filesystems I've written, the readdir call locks
> the directory exclusively.

The problem would be if a fuse filesystem assumed serialized
lookup/readdir and did not do any locking itself.

We probably need to conditionally re-add the lookup/readdir
serialization to the fuse kernel module, with an INIT flag to
explicitly enable parallel readdir and lookup (i.e. disable the
serialization).
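
As a rough sketch of what such an opt-in could look like: serialization
stays the default and is dropped only when both sides agree on the flag.
This is a userspace model, not fuse kernel code, and since no flag has
been named yet, EXAMPLE_INIT_PARALLEL_DIROPS, its bit value and both
helper functions are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical INIT flag bit; name and value are invented for this sketch. */
#define EXAMPLE_INIT_PARALLEL_DIROPS (UINT32_C(1) << 0)

/* Only bits offered by the kernel and requested by the filesystem
 * daemon survive the negotiation. */
static uint32_t negotiate(uint32_t kernel_offers, uint32_t daemon_wants)
{
    return kernel_offers & daemon_wants;
}

/* Serialization stays on unless the daemon explicitly opted out, so a
 * filesystem written before the flag existed keeps the old behaviour. */
static bool must_serialize_dirops(uint32_t negotiated)
{
    return (negotiated & EXAMPLE_INIT_PARALLEL_DIROPS) == 0;
}

/* Small self-check of the default: an old daemon that sets no flags
 * still gets serialized lookup/readdir. */
static void check_default_is_serialized(void)
{
    assert(must_serialize_dirops(negotiate(EXAMPLE_INIT_PARALLEL_DIROPS, 0)));
}
```

The point of the design is the direction of the default: an unset bit
means "serialize", so nothing changes for existing filesystems.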

Thanks,
Miklos


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31  7:17 ` [fuse-devel] Changes in 4.7 Miklos Szeredi
@ 2016-05-31 10:57   ` Stef Bon
  2016-05-31 11:09     ` Miklos Szeredi
       [not found]   ` <nijg09$6k0$1@ger.gmane.org>
  1 sibling, 1 reply; 11+ messages in thread
From: Stef Bon @ 2016-05-31 10:57 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: fuse-devel

2016-05-31 9:17 GMT+02:00 Miklos Szeredi <miklos@szeredi.hu>:
> On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
>> Hi,
>>

>
> The problem would be if a fuse filesystem assumed serialized
> lookup/readdir and did not do any locking itself.
>

Yes of course.

> We probably need to conditionally re-add the lookup/readdir
> serialization to the fuse kernel module, with an INIT flag to
> explicitly enable parallel readdir and lookup (i.e. disable the
> serialization).

Yes, via an INIT flag, where serialization will be the default to
provide backwards compatibility.
You say "re-add". Has this been in the kernel module before?

Stef


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 10:57   ` Stef Bon
@ 2016-05-31 11:09     ` Miklos Szeredi
  0 siblings, 0 replies; 11+ messages in thread
From: Miklos Szeredi @ 2016-05-31 11:09 UTC (permalink / raw)
  To: Stef Bon; +Cc: linux-fsdevel, fuse-devel

On Tue, May 31, 2016 at 12:57 PM, Stef Bon <stefbon@gmail.com> wrote:
> 2016-05-31 9:17 GMT+02:00 Miklos Szeredi <miklos@szeredi.hu>:
>> On Tue, May 31, 2016 at 9:08 AM, Stef Bon <stefbon@gmail.com> wrote:
>>> Hi,
>>>
>
>>
>> The problem would be if a fuse filesystem assumed serialized
>> lookup/readdir and did not do any locking itself.
>>
>
> Yes of course.
>
>> We probably need to conditionally re-add the lookup/readdir
>> serialization to the fuse kernel module, with an INIT flag to
>> explicitly enable parallel readdir and lookup (i.e. disable the
>> serialization).
>
> Yes, via an INIT flag, where serialization will be the default to
> provide backwards compatibility.
> You say "re-add". Has this been in the kernel module before?

It has been in the VFS before.  Now we need to re-add it into the fuse module.

Thanks,
Miklos


* Re: [fuse-devel] Changes in 4.7.
       [not found]         ` <87shwy8b38.fsf@thinkpad.rath.org>
@ 2016-05-31 16:25           ` Stef Bon
  2016-05-31 17:22             ` Stef Bon
  2016-05-31 17:44             ` Al Viro
  0 siblings, 2 replies; 11+ messages in thread
From: Stef Bon @ 2016-05-31 16:25 UTC (permalink / raw)
  To: fuse-devel, linux-fsdevel

Hi,

I've been thinking about the non serialized readdirs. I do not understand.
Readdirs have to be serialized, since the offset of the next readdir
(belonging to the opendir) is known when the current readdir is
finished:
"start where current left".

So it means probably that lookup (which in fact is only a read only
operation) of a name in a directory can be done while a readdir
is active on the same directory.

Or something else? Two readdirs (or more) belonging to different
opendirs active on the same directory?

Stef


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 16:25           ` Stef Bon
@ 2016-05-31 17:22             ` Stef Bon
  2016-05-31 17:44             ` Al Viro
  1 sibling, 0 replies; 11+ messages in thread
From: Stef Bon @ 2016-05-31 17:22 UTC (permalink / raw)
  To: fuse-devel, linux-fsdevel

Hi,

the article explains a lot:

https://lwn.net/Articles/685108/

But readdirs are still serialized, contrary to what Linus Torvalds
writes in that message. It's only about lookups in parallel, where the
case of two different lookups of the same name in the same parent is
taken care of.

 Stef


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 16:25           ` Stef Bon
  2016-05-31 17:22             ` Stef Bon
@ 2016-05-31 17:44             ` Al Viro
  2016-05-31 18:44               ` Stef Bon
  1 sibling, 1 reply; 11+ messages in thread
From: Al Viro @ 2016-05-31 17:44 UTC (permalink / raw)
  To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Tue, May 31, 2016 at 06:25:24PM +0200, Stef Bon wrote:

> I've been thinking about the non serialized readdirs. I do not understand.
> Readdirs have to be serialized, since the offset of the next readdir
> (belonging to the opendir) is known when the current readdir is
> finished:
> "start where current left".

They are serialized per struct file (and so is lseek() on them, for that
matter).  So the state that is associated with an opened file is just
fine; it's modifiable state associated with the directory itself, and shared
between all opened files, that would be a problem.

IOW, they can do readdir in parallel exactly in the cases when lseek
done by one of them would not affect another.


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 17:44             ` Al Viro
@ 2016-05-31 18:44               ` Stef Bon
  2016-05-31 20:29                 ` Al Viro
  0 siblings, 1 reply; 11+ messages in thread
From: Stef Bon @ 2016-05-31 18:44 UTC (permalink / raw)
  To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-05-31 19:44 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:
> On Tue, May 31, 2016 at 06:25:24PM +0200, Stef Bon wrote:
>
>> I've been thinking about the non serialized readdirs. I do not understand.
>> Readdirs have to be serialized, since the offset of the next readdir
>> (belonging to the opendir) is known when the current readdir is
>> finished:
>> "start where current left".
>
> They are serialized per struct file (and so is lseek() on them, for that
> matter).  So the state that is associated with an opened file is just
> fine; it's modifiable state associated with the directory itself, and shared
> between all opened files, that would be a problem.

I'm really sorry, but what do you mean by struct file? We're talking
about directories and I do not understand what struct file means here.

>
> IOW, they can do readdir in parallel exactly in the cases when lseek
> done by one of them would not affect another.

And when does lseek not affect another? Is this right: when there are
no changes in the entries, i.e. no entries are created or removed
(or moved away)?
(Which probably means the cache of names in the directory is up to date.)

Stef


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 18:44               ` Stef Bon
@ 2016-05-31 20:29                 ` Al Viro
  2016-06-01 12:32                   ` Stef Bon
  0 siblings, 1 reply; 11+ messages in thread
From: Al Viro @ 2016-05-31 20:29 UTC (permalink / raw)
  To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Tue, May 31, 2016 at 08:44:34PM +0200, Stef Bon wrote:

> > IOW, they can do readdir in parallel exactly in the cases when lseek
> > done by one of them would not affect another.
> 
> And when does lseek not affect another? Is this right: when there are
> no changes in the entries, i.e. no entries are created or removed
> (or moved away)?
> (Which probably means the cache of names in the directory is up to date.)

No.  On any Unix, since before the transition to PDP-11, there is a distinction
between file descriptor and opened file.  There are three layers of objects
in there:
1)	descriptor
2)	opened file
3)	filesystem object
Each has its own set of properties and system calls to manipulate those.
open(2) creates new objects in layers 1 and 2.  dup(2) acts in layer 1 alone -
you get a new descriptor referring to the same opened file (unfortunate name,
that - something like "open IO channel" would be less confusing).  fork(2)
also acts only on layer 1.  The primary effect of close(2) is also in layer
1, but the side effects might reach into layer 2.

lseek(2) is a layer 2 operation.  Current IO position is a property of
an opened file, *not* of a descriptor or of underlying filesystem object.
Had been, since the moment they'd implemented redirects.  For a TTY it
doesn't matter, but think what happens when you do (date; ls) > foo.
shell opens the file we are redirecting to and uses dup2() (or close() + dup(),
for that matter) to make descriptor 1 (stdout) point to it.  Then
date(1) writes a string to its descriptor 1 (inherited from shell).  Then
ls(1) does the same.  Both pieces of output end up written to foo; so far,
so good, but you want the output of ls(1) to start *after* the output of date(1).
In other words, current IO position should be shared across fork() and dup().
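
A small sketch of that redirect case, for anyone who wants to try it: the
child stands in for date(1), the parent for ls(1), and the two strings
stand in for their output.  The function name and the /tmp path are just
choices made for this example, and it assumes a POSIX system with a
writable /tmp.

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if the child's output precedes the parent's in the file,
 * 0 if it does not, -1 on setup errors. */
static int shared_offset_across_fork(void)
{
    char path[] = "/tmp/redirect-demo-XXXXXX";
    int fd = mkstemp(path);          /* one open(): one opened file */
    if (fd < 0)
        return -1;
    unlink(path);                    /* the name is not needed any more */

    pid_t pid = fork();              /* child's descriptor refers to the same opened file */
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Play date(1): make descriptor 1 point to the file, then write. */
        if (dup2(fd, STDOUT_FILENO) < 0 ||
            write(STDOUT_FILENO, "date\n", 5) != 5)
            _exit(1);
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* Play ls(1): the shared IO position is already 5, not 0. */
    if (write(fd, "ls\n", 3) != 3) {
        close(fd);
        return -1;
    }

    char buf[16] = {0};
    ssize_t n = pread(fd, buf, sizeof buf - 1, 0);
    close(fd);
    return n == 8 && memcmp(buf, "date\nls\n", 8) == 0;
}
```

If the position were per-descriptor instead, the parent's write would land
at offset 0 and clobber the child's output.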

OTOH, it obviously can't be a property of underlying filesystem object -
you do _not_ want e.g. grep qsort *.[ch] from one terminal to play havoc on
cc a.c from another.

Each time you call open(2) you get a new opened file (IO channel, whatever you
call it) *and* a new descriptor referring to it.  fork()/dup()/dup2()/close()
act upon descriptors; so does exit(), for that matter.  When all references
to an opened file disappear, that opened file gets closed.  It is a common
effect of close(2), but it's a separate event; moreover, that event might have
further side effects - if a file had been opened and unlinked, closing the
opened file in question might trigger the destruction of underlying filesystem
object, provided there are no surviving hardlinks to it.
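
The opened-and-unlinked case can be observed from userspace with something
like the following sketch (the function name and the /tmp path are made up
for the example; it assumes a POSIX system with a writable /tmp):

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if the data stays reachable through the opened file after
 * the last name is removed, 0 if not, -1 on setup errors. */
static int unlinked_file_survives_until_close(void)
{
    char path[] = "/tmp/unlink-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    if (write(fd, "hello", 5) != 5 || unlink(path) != 0) {
        unlink(path);
        close(fd);
        return -1;
    }

    /* The name is gone... */
    int name_gone = (access(path, F_OK) == -1 && errno == ENOENT);

    /* ...but the opened file still reaches the filesystem object. */
    char buf[5];
    int data_alive = (pread(fd, buf, sizeof buf, 0) == 5 &&
                      memcmp(buf, "hello", 5) == 0);

    /* Dropping the last reference closes the opened file; only then can
     * the underlying object be destroyed. */
    close(fd);
    return name_gone && data_alive;
}
```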

read()/write()/lseek() act upon the opened file; it is specified by descriptor,
but the effects are the same whichever descriptor referring to that opened
file had been used.  On the other hand, the effect *does* depend upon the
opened file being involved, not just the underlying filesystem object.

All of the above applies to directories.  Well, almost - you get getdents(2)
instead of read(2) and no analogue of write(2).  The notion of the current
IO position, descriptor vs. opened file distinction, difference between
open() + dup() and open() + open() - all of that is identical to the situation
with regular files.

"struct file" is a fairly common name for the structure representing an
opened file (regardless of the file type).  On all kinds of Unices, Linux
included...
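
To make the three layers concrete, here is a toy userspace model.  It is
only an illustration: none of the names below (fs_object, open_file,
fd_table, toy_*) are the kernel's actual definitions, storage is supplied
by the caller, and the bounds checks are plain asserts.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_FDS 8

struct fs_object { int nlink; };          /* layer 3: filesystem object */

struct open_file {                        /* layer 2: opened file */
    struct fs_object *obj;
    long pos;                             /* current IO position lives here */
    int refs;                             /* descriptors referring to us */
};

struct fd_table {                         /* layer 1: descriptors */
    struct open_file *slot[TOY_MAX_FDS];
};

/* Put f into the lowest free slot; returns the descriptor or -1. */
static int toy_install(struct fd_table *t, struct open_file *f)
{
    for (int fd = 0; fd < TOY_MAX_FDS; fd++) {
        if (t->slot[fd] == NULL) {
            t->slot[fd] = f;
            f->refs++;
            return fd;
        }
    }
    return -1;
}

/* open(): a new opened file *and* a new descriptor for it. */
static int toy_open(struct fd_table *t, struct open_file *f,
                    struct fs_object *obj)
{
    f->obj = obj;
    f->pos = 0;
    f->refs = 0;
    return toy_install(t, f);
}

/* dup(): a new descriptor only; the opened file is shared. */
static int toy_dup(struct fd_table *t, int fd)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS && t->slot[fd] != NULL);
    return toy_install(t, t->slot[fd]);
}

/* lseek(): named by descriptor, but acts on the opened file. */
static long toy_lseek(struct fd_table *t, int fd, long pos)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS && t->slot[fd] != NULL);
    t->slot[fd]->pos = pos;
    return pos;
}

/* close(): drops one descriptor; returns 1 when that was the last
 * reference, i.e. when the opened file itself gets closed. */
static int toy_close(struct fd_table *t, int fd)
{
    assert(fd >= 0 && fd < TOY_MAX_FDS && t->slot[fd] != NULL);
    struct open_file *f = t->slot[fd];
    t->slot[fd] = NULL;
    return --f->refs == 0;
}
```

In this model a toy_lseek() through a dup'ed descriptor is visible through
the original one, while a second toy_open() of the same fs_object keeps its
own position.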


* Re: [fuse-devel] Changes in 4.7.
  2016-05-31 20:29                 ` Al Viro
@ 2016-06-01 12:32                   ` Stef Bon
  2016-06-01 13:52                     ` Al Viro
  0 siblings, 1 reply; 11+ messages in thread
From: Stef Bon @ 2016-06-01 12:32 UTC (permalink / raw)
  To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-05-31 22:29 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:
> All of the above applies to directories.  Well, almost - you get getdents(2)
> instead of read(2) and no analogue of write(2).  The notion of the current
> IO position, descriptor vs. opened file distinction, difference between
> open() + dup() and open() + open() - all of that is identical to the situation
> with regular files.
>
> "struct file" is a fairly common name for the structure representing an
> opened file (regardless of the file type).  On all kinds of Unices, Linux
> included...

I understand that a directory is similar to a file, and that the struct
file also applies to a directory.
Thanks a lot for your detailed explanation!

When does an lseek not affect another? Does this depend on how the
filesystem deals with a directory, i.e. how it is stored? When a
directory is nothing more than a linked list, where new entries are
appended at the end, seeking through the linked list will not get mixed
up, compared to using another method. I use skiplists (and only those)
for directories.

Stef


* Re: [fuse-devel] Changes in 4.7.
  2016-06-01 12:32                   ` Stef Bon
@ 2016-06-01 13:52                     ` Al Viro
  2016-06-01 14:44                       ` Stef Bon
  0 siblings, 1 reply; 11+ messages in thread
From: Al Viro @ 2016-06-01 13:52 UTC (permalink / raw)
  To: Stef Bon; +Cc: fuse-devel, linux-fsdevel

On Wed, Jun 01, 2016 at 02:32:46PM +0200, Stef Bon wrote:

> I understand that a directory is similar to a file, and that the struct
> file also applies to a directory.
> Thanks a lot for your detailed explanation!
>
> When does an lseek not affect another? Does this depend on how the
> filesystem deals with a directory, i.e. how it is stored? When a
> directory is nothing more than a linked list, where new entries are
> appended at the end, seeking through the linked list will not get mixed
> up, compared to using another method. I use skiplists (and only those)
> for directories.

No, it has nothing to do with the way directory is stored, etc.  After
	int fd1 = open("foo", O_DIRECTORY);
	int fd2 = dup(fd1);
	int fd3 = open("foo", O_DIRECTORY);
lseek() on fd1 and fd2 manipulate the same object; that on fd3 is independent.
Current IO position, both for regular files and directories, is a property
of an object created by open(); dup()/dup2()/fork() create aliases for those
objects.  getdents(2) is serialized on per-open() basis; if two descriptors
are aliases ultimately coming from the same open() call, the calls of
getdents() on them will be treated as if they were sequential calls of
getdents() on the same descriptor - each call will read a new chunk of
directory.  If descriptors result from separate open() calls, getdents() on
one of them has no effect on getdents() on another.
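
A sketch that checks this from userspace on Linux, using the raw
getdents64 system call on a freshly created empty directory.  The helper
names and the /tmp path are invented for the example, and it assumes the
filesystem behind /tmp reports "." and ".." through getdents, as the
usual Linux filesystems do.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for syscall() */
#endif
#include <assert.h>   /* for callers that want to assert on the result */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Raw getdents64(2), Linux-specific.  Returns bytes read, 0 at end of
 * directory, -1 on error. */
static long raw_getdents(int fd, void *buf, size_t len)
{
    return syscall(SYS_getdents64, fd, buf, len);
}

/* Read until end of directory; returns total bytes read or -1. */
static long drain(int fd)
{
    char buf[4096];
    long total = 0, n;

    while ((n = raw_getdents(fd, buf, sizeof buf)) > 0)
        total += n;
    return n < 0 ? -1 : total;
}

/* Returns 1 if the directory position is shared by dup() aliases and
 * independent across separate open() calls, 0 if not, -1 on errors. */
static int dir_position_is_per_open(void)
{
    char path[] = "/tmp/dir-demo-XXXXXX";
    if (mkdtemp(path) == NULL)
        return -1;

    int fd1 = open(path, O_RDONLY | O_DIRECTORY);
    int fd2 = dup(fd1);
    int fd3 = open(path, O_RDONLY | O_DIRECTORY);
    int ok = -1;

    if (fd1 >= 0 && fd2 >= 0 && fd3 >= 0) {
        long a = drain(fd1);    /* moves the shared position to the end */
        long b = drain(fd2);    /* alias of fd1: nothing left to read */
        long c = drain(fd3);    /* separate open(): starts from the beginning */
        ok = (a > 0 && b == 0 && c > 0);
    }

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    if (fd3 >= 0) close(fd3);
    rmdir(path);
    return ok;
}
```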

It's exactly the same as for regular files - if you have
	int fd1 = open("bar", 0);	// first channel opened
	int fd2 = dup(fd1);		// fd2 refers to the same channel
	int fd3 = open("bar", 0);	// second channel opened
	char c1, c2, c3;
	read(fd1, &c1, 1);	// offset of the first channel goes 0 -> 1
	read(fd2, &c2, 1);	// offset of the first channel goes 1 -> 2
	read(fd3, &c3, 1);	// offset of the second channel goes 0 -> 1
c1 and c3 will contain the first byte of our file and c2 - the second one.
If you continue that with
	lseek(fd2, 0, SEEK_SET);// offset of the first channel goes 2 -> 0
	read(fd1, &c1, 1);
c1 will be the first byte.  If that lseek() had been done to fd3 instead of
fd2, c1 would be the third byte, since the offset in the first channel would've
been unaffected by the operation on the second one.
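
The same sequence as a self-contained sketch, run against a temporary file
containing "abc".  The function name, the file contents and the /tmp path
are choices made for the example only.

```c
#include <assert.h>   /* for callers that want to assert on the result */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Replays the reads and the lseek from the text above.  c[0..2] receive
 * c1, c2, c3 and c[3] receives the value c1 has after the lseek.
 * Returns 0 on success, -1 on any error. */
static int offsets_demo(char c[4])
{
    char path[] = "/tmp/offset-demo-XXXXXX";
    int tmp = mkstemp(path);
    if (tmp < 0)
        return -1;
    if (write(tmp, "abc", 3) != 3) {
        close(tmp);
        unlink(path);
        return -1;
    }
    close(tmp);

    int fd1 = open(path, O_RDONLY);   // first channel opened
    int fd2 = dup(fd1);               // fd2 refers to the same channel
    int fd3 = open(path, O_RDONLY);   // second channel opened
    unlink(path);

    int ok = fd1 >= 0 && fd2 >= 0 && fd3 >= 0
        && read(fd1, &c[0], 1) == 1         // first channel: 0 -> 1
        && read(fd2, &c[1], 1) == 1         // first channel: 1 -> 2
        && read(fd3, &c[2], 1) == 1         // second channel: 0 -> 1
        && lseek(fd2, 0, SEEK_SET) == 0     // first channel: 2 -> 0
        && read(fd1, &c[3], 1) == 1;        // first byte again

    if (fd1 >= 0) close(fd1);
    if (fd2 >= 0) close(fd2);
    if (fd3 >= 0) close(fd3);
    return ok ? 0 : -1;
}
```

With "abc" in the file, c[0], c[2] and c[3] come back as 'a' and c[1] as
'b', matching the description above.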

For regular files, the kernel serializes read()/write()/lseek() done on
descriptors aliasing each other.  Now it does the same for getdents()/lseek()
of directories.

From the filesystem point of view, you might see two getdents() called
in parallel only if their results should be unaffected by the order of
operations.


* Re: [fuse-devel] Changes in 4.7.
  2016-06-01 13:52                     ` Al Viro
@ 2016-06-01 14:44                       ` Stef Bon
  0 siblings, 0 replies; 11+ messages in thread
From: Stef Bon @ 2016-06-01 14:44 UTC (permalink / raw)
  To: Al Viro; +Cc: fuse-devel, linux-fsdevel

2016-06-01 15:52 GMT+02:00 Al Viro <viro@zeniv.linux.org.uk>:
> For regular files, the kernel serializes read()/write()/lseek() done on
> descriptors aliasing each other.  Now it does the same for getdents()/lseek()
> of directories.
>
> From the filesystem point of view, you might see two getdents() called
> in parallel only if their results should be unaffected by the order of
> operations.

Ah right! Now I understand. Thanks a lot for your explanation!
Of course, when a second descriptor refers to the same object, operations
like read/lseek have to be serialized.
If not, strange things will happen.
I was thinking on a totally different level.

Stef


